02_browsing:05_additional:02_export

Differences

This shows you the differences between two versions of the page.

02_browsing:06_export [2020/04/22 11:25] – ↷ Links adapted because of a move operation simone
02_browsing:05_additional:02_export [2022/06/27 09:21] (current) – external edit 127.0.0.1
Line 1: Line 1:
-====== 2.Export ====== +====== 2.5.2 Export ====== 
-After performing a query, you can click on "More" and then "Export" to export your results. As you can see from figure 1, you have different formats available. Very often, GridExporter is your preferred Exporter.+After performing a query, you can click on ''More'' and then ''Export'' to export your results. There are five different exporters available: WekaExporter, CSVExporter, TokenExporter, GridExporter and SimpleTextExporter, as can be seen in Figure 1. Each of them is described in the sections below.
  
 {{ :02_browsing:exportoptions.png?400 |}} {{ :02_browsing:exportoptions.png?400 |}}
-Figure 1: Export options +Figure 1: Different exporters and additional options for the export
- +
-Next to the type of export, you have the option "Left and right context", which is the same for all export formats. Here, you can define the number of entities to be exported to the left or right of of your search query. The entity is in the same unit as your query, i.e. if you query for tokens, you can select the number of tokens to be shown, while if you query for messages, this is the number of messages. +
- +
-The other options, "Annotation keys" and "Parameters" depend on the export format and are explained to the right when you select an export option. +
- +
-Once you click "Perform Export", the system will create the export in the memory and you can click "download" to have it downloaded to your own computer. +
- +
-Exports are very hungry in resources, thus, it might take a while to create an export or the server might even hang. The simpler your query, the less problems you have. **Tip**: instead of formulating a complex [[02_browsing:04_queries:03_regex|RegEx]] query, it might be more useful to create several simpler queries and then add the resulting files together.+
  
  
  
 ===== WekaExporter ===== ===== WekaExporter =====
-This exporter is very specific for the data mining application [[https://www.cs.waikato.ac.nz/ml/weka/|Weka]]. If you are familiar with Weka, this is a good option for you.+This exporter is designed specifically for the data mining application [[https://www.cs.waikato.ac.nz/ml/weka/|Weka]]. 
  
 ===== CSVExporter ===== ===== CSVExporter =====
 This exporter creates one line per result. In this line, you see the text you queried for as well as all the annotations available on the token level. Depending on the sub-corpus, these are the token itself as well as [[01_corpus:02_preprocessing:06_pos|PoS]] annotations. This exporter creates one line per result. In this line, you see the text you queried for as well as all the annotations available on the token level. Depending on the sub-corpus, these are the token itself as well as [[01_corpus:02_preprocessing:06_pos|PoS]] annotations.
  
-The field "Annotation Keys" is not used in this export. 
- 
-Under "Parameters", you can add annotations that pertain to the chat. More precisely, you can add all annotations that are listed under "Meta Annotations" in the [[02_browsing:01_sub_corpora|information display]] per sub-corpus. To list that kind of information, you use the form: //metakeys=doc// to display the chat ID. More values can be added with commas. 
  
 ===== TokenExporter ===== ===== TokenExporter =====
-This exporter is intended for smaller corpora than ours. It normally hangs even at very small queries. We recommend not to use it.+This exporter is intended for smaller corpora than ours. With our (sub-)corpora, it often hangs even on very small queries. We recommend not using it.
  
 ===== GridExporter ===== ===== GridExporter =====
-This exporter offers is the most versatile one, since you can choose the annotations that you want to export. Figure 2 shows an example in which one token to the left and one to the right are exported as well as the whole message, the message ID, the token queried for and the age_range (not visible). Additionally, the meta key for the chat ID is exported as explained above.+This exporter is the most versatile one, since you can choose the annotations that you want to export. Figure 2 shows an example in which one token to the left and one to the right are exported as well as the whole message, the message ID, the token queried for and the age_range (not visible). Additionally, the meta key for the chat ID is exported. 
  
 {{ :02_browsing:gridexporter.png?400 |}} {{ :02_browsing:gridexporter.png?400 |}}
Line 39: Line 28:
 Figure 3: Results of the export (extract) Figure 3: Results of the export (extract)
  
-As you can see in figure 3, each result is preceded by a number starting with 0. You then see all the annotation keys selected in figure 2 in the selected order: whole message, message ID, token (your query is in the center, in this case //demain// plus the left and right token that you selected with the left and right context), age_range and then the chat ID selected with //metakeys=doc//.+As you can see in Figure 3, each result is preceded by a number starting with 0. You then see all the annotation keys selected in Figure 2 in the selected order: whole message, message ID, token (your query is in the center, in this case //demain// plus the left and right token that you selected with the left and right context), age_range and then the chat ID selected with ''metakeys=doc''.
  
 If you leave the field "Annotation keys" empty, your export contains all annotations available on the token level. Very often this is too much, so it is better to make a selection as shown above. If you leave the field "Annotation keys" empty, your export contains all annotations available on the token level. Very often this is too much, so it is better to make a selection as shown above.
 +
 +===== SimpleTextExporter =====
 +This exporter creates a list of the token(s) you queried for with the number of preceding and following tokens you selected in the options. The results are numbered. No additional information is exported.
 +
 +===== Additional options =====
 +Next to the type of export, you have the option “Left and right context”, which is the same for all export formats. Here, you can define the number of entities to be exported to the left or right of your search query. The entity is in the same unit as your query, i.e. if you query for tokens, you can select the number of tokens to be shown, while if you query for messages, this is the number of messages. 
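
The effect of "Left and right context" can be pictured with a minimal sketch. It is not the exporter's actual implementation: the function name ''with_context'' and the sample tokens are made up for illustration, and only //demain// comes from the example on this page. The same logic applies whether the units are tokens or messages.

```python
def with_context(units, match_index, left, right):
    """Return (left context, match, right context) for one hit.

    `units` is a list of tokens or messages, depending on what was queried.
    Hypothetical helper for illustration only.
    """
    start = max(0, match_index - left)  # do not run past the start of the list
    left_context = units[start:match_index]
    right_context = units[match_index + 1:match_index + 1 + right]
    return left_context, units[match_index], right_context


# Made-up token sequence; the hit is "demain".
tokens = ["on", "se", "voit", "demain", "matin", "alors"]
window = with_context(tokens, tokens.index("demain"), left=1, right=1)
print(window)
```

With one unit of context on each side, the hit is returned together with its direct neighbours. At the edges of a text, fewer units than requested are available.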
 +
 +The other options, "Annotation keys" and "Parameters", depend on the export format and are explained to the right when you select an export option.
 +
 +Under “Parameters”, you can add annotations that pertain to the chat. More precisely, you can add all annotations that are listed under “Meta Annotations” in the information display per sub-corpus. To list that kind of information, you use the form: ''metakeys=doc'' to display the chat ID. More values can be added with commas.
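
As a small sketch of the form described above, the value for the "Parameters" field could be assembled like this. The helper ''metakeys_parameter'' and the key ''example_key'' are hypothetical; only ''doc'' is taken from this page. Whether the exporter tolerates spaces around the commas is not stated here, so the sketch removes them as a precaution.

```python
def metakeys_parameter(keys):
    """Build the text for the "Parameters" field from a list of meta annotation names.

    Hypothetical helper; the names must match entries listed under
    "Meta Annotations" for the sub-corpus.
    """
    cleaned = [key.strip() for key in keys if key.strip()]
    if not cleaned:
        raise ValueError("at least one meta annotation name is required")
    return "metakeys=" + ",".join(cleaned)


print(metakeys_parameter(["doc"]))
# "example_key" is a placeholder, not a real annotation name.
print(metakeys_parameter(["doc", " example_key "]))
```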
 +
 +Once you click ''Perform Export'', the system creates the export in memory, and you can then click ''download'' to save it to your own computer.
  
  
  
 +Exports are resource-intensive, so it may take a while to create one, and the server might even hang. The simpler your query, the fewer problems you will have. **Hint**: instead of formulating a complex [[02_browsing:04_queries:03_regex|RegEx]] query, it might be more useful to create several simpler queries and then merge the resulting files.
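
Combining the results of several simpler queries can be done with a few lines of Python. The following is a minimal sketch under stated assumptions: each downloaded export is a UTF-8 text file with one result per line, and the file names and contents are invented for the demonstration. Since each export numbers its results from 0, the numbers restart in the combined file; the sketch does not renumber them.

```python
import tempfile
from pathlib import Path


def merge_exports(paths, out_path):
    """Append the non-empty lines of each export file to out_path, in order.

    Returns the number of lines written. Result numbers are not renumbered.
    """
    written = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for path in paths:
            with open(path, encoding="utf-8") as src:
                for line in src:
                    if line.strip():  # skip blank lines
                        out.write(line.rstrip("\n") + "\n")
                        written += 1
    return written


# Demonstration with two made-up export files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp, "query_a.txt")
    second = Path(tmp, "query_b.txt")
    first.write_text("0\tdemain\n1\tdemain\n", encoding="utf-8")
    second.write_text("0\tmatin\n\n", encoding="utf-8")
    merged = Path(tmp, "merged.txt")
    written = merge_exports([first, second], merged)
    merged_lines = merged.read_text(encoding="utf-8").splitlines()

print(written, merged_lines)
```

For real exports, pass the paths of the downloaded files instead of the temporary ones. If the files start with a header line, it would need to be dropped from all but the first file.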
  
 + 
  
  
    
02_browsing/05_additional/02_export.1587547516.txt.gz · Last modified: 2022/06/27 09:21 (external edit)
