
Analyzing texts with AntConc


AntConc is a free, cross-platform tool for corpus analysis. You can download AntConc for your operating system here:
http://www.laurenceanthony.net/software/antconc/

Getting familiar with AntConc

After downloading and installing AntConc, launch the program. AntConc's interface looks like this:

[Screenshot: the AntConc interface]

Choose 3-5 text files in your preferred language from the "Multilingual Data Set of Novels for Teaching and Research" and import them into AntConc by selecting File > Open. If you want to load every plain-text file in a folder, you can select the "Open Directory" option instead.


Keep in mind that AntConc can only load plain-text files (.txt). If you want to load other types of documents and file formats, you will need to convert them to plain text first (see the "Compiling" section of this guide).
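If you want to check a folder before opening it in AntConc, the short Python sketch below reads every .txt file in a directory and skips everything else. It is not part of AntConc; the function name `load_corpus` and the UTF-8 default are assumptions made for illustration, and your files may use a different encoding.

```python
from pathlib import Path


def load_corpus(directory, encoding="utf-8"):
    """Read every .txt file in `directory` into a {filename: text} dict.

    Files with other extensions are ignored, mirroring the
    plain-text-only restriction described above.
    """
    corpus = {}
    for path in sorted(Path(directory).glob("*.txt")):
        corpus[path.name] = path.read_text(encoding=encoding)
    return corpus
```

Calling `load_corpus("novels")` on a hypothetical folder named `novels` would return a dictionary keyed by file name, which makes it easy to spot files that are missing because they are not plain text.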

Once the documents are loaded, start with a basic word search: type a word in the language(s) of the documents you loaded and click "Start."

In this example I used the word "country."


Based on my search, AntConc generates a list of concordance hits in the main panel. In the left panel I can view the corpus files I loaded. The concordance function shows the keyword I used in my search, "country," in context, that is, within the lines of text where it occurs in the corpus files. This is a useful feature that can quickly reveal patterns in the corpus. The "Concordance Plot" tab visualizes where every instance of the keyword falls within each file.
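To make the idea of a keyword in context more concrete, here is a minimal Python sketch of a concordance. It is not how AntConc is implemented; the function name `kwic`, the `width` parameter, and the sample sentence are all assumptions made for illustration.

```python
import re


def kwic(text, keyword, width=30):
    """Return (left context, match, right context) for each whole-word hit."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    hits = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Replace newlines so each hit can be printed on a single line
        hits.append((left.replace("\n", " "), match.group(0), right.replace("\n", " ")))
    return hits


sample = "Every country has a story. This country is no exception."
for left, word, right in kwic(sample, "country", width=15):
    # Right-align the left context so the keyword lines up in one column
    print(f"{left:>15} | {word} | {right}")
```

Aligning the keyword in a single column is what makes recurring patterns to its left and right easy to see at a glance.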


If you click on a black line representing an occurrence of the keyword, you can see the keyword highlighted in the text.
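The plot can be thought of as each hit's position expressed as a fraction of the file's length. The sketch below is a rough text-only approximation of that idea, not AntConc's own code; the names `plot_positions` and `render_bar` and the bar width are hypothetical.

```python
import re


def plot_positions(text, keyword):
    """Return each hit's start position as a fraction (0.0 to 1.0) of the text length."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    return [match.start() / len(text) for match in pattern.finditer(text)]


def render_bar(positions, width=40):
    """Draw a one-line bar in which '|' marks a hit and '.' marks no hit."""
    bar = ["."] * width
    for position in positions:
        bar[min(int(position * width), width - 1)] = "|"
    return "".join(bar)
```

Printing `render_bar(plot_positions(text, "country"))` for each file gives a crude view of whether the keyword clusters at the beginning, middle, or end of a text.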


Task: Collocates

Switch to the "Collocates" tab, keep the same keyword in the search box, and click "Start". Click "Ok" when you see this message:

[Screenshot: AntConc message dialog]

A list of "collocates" (words in proximity to the search keyword) is generated:

[Screenshot: list of collocates]

You can sort collocates by frequency, by overall statistical factor, or alphabetically. Use File > "Export Settings to File" to export the results into a plain-text file. You can also use "Clone results" to get a pop-up window with the results, which you can copy into a text editor. Use a text editor to open and view the results.
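As a rough illustration of what a collocate list contains, the following Python sketch counts the words that occur within a fixed number of tokens on either side of the keyword. It only counts raw frequency and does not compute the statistical score AntConc reports; the function name `collocates`, the `span` parameter, and the simple `\w+` tokenizer are assumptions.

```python
import re
from collections import Counter


def collocates(text, keyword, span=5):
    """Count words appearing within `span` tokens to the left or right of `keyword`."""
    tokens = re.findall(r"\w+", text.lower())
    target = keyword.lower()
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == target:
            # Neighbours on both sides, excluding the keyword occurrence itself
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts
```

With the result in hand, `counts.most_common()` sorts by frequency and `sorted(counts)` sorts alphabetically, which corresponds to two of the sort options described above.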


Task: Comparing corpora

In "Tool Preferences" in the menu, add the files you want to compare against when generating keyword lists. The files you add will be used as a reference corpus. This is useful when you want to test a small sample of text against a larger dataset or corpus. Once you select the files (they also have to be in plain-text format), click Load > Apply. You are now ready to switch to the "Keyword List" tab and click "Start".


This will generate a list of keywords. The words listed are words in the documents you initially loaded that appear more often than would be statistically expected, given their frequency in the reference corpus. "Keyness" refers to the frequency of a word in a set of texts in relation to its frequency in a reference corpus. You can sort results by "Keyness", frequency, or word. Use File > "Export Settings to File" to export the results into a plain-text file. You can also use "Clone results" to get a pop-up window with the results, which you can copy into a text editor. Use a text editor to open and view the results.
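To show how a keyness score can be derived from the two frequencies, here is a minimal Python sketch that uses the log-likelihood statistic. This is an assumption: the statistic AntConc applies depends on its settings, and the function names, the `\w+` tokenizer, and the choice to list only words that are relatively more frequent in the target texts are all illustrative.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def log_likelihood(a, b, c, d):
    """Log-likelihood for a word seen `a` times in a target corpus of `c`
    tokens and `b` times in a reference corpus of `d` tokens."""
    expected_target = c * (a + b) / (c + d)
    expected_reference = d * (a + b) / (c + d)
    total = 0.0
    if a > 0:
        total += a * math.log(a / expected_target)
    if b > 0:
        total += b * math.log(b / expected_reference)
    return 2 * total


def keyness(target_text, reference_text):
    """Return (word, score) pairs, highest score first."""
    target = Counter(tokenize(target_text))
    reference = Counter(tokenize(reference_text))
    c, d = sum(target.values()), sum(reference.values())
    if c == 0 or d == 0:
        raise ValueError("both corpora must contain at least one token")
    scores = {}
    for word, a in target.items():
        b = reference[word]
        # Keep only words that are relatively more frequent in the target
        if a / c > b / d:
            scores[word] = log_likelihood(a, b, c, d)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

A word with the same relative frequency in both corpora scores zero, and the score grows as the word becomes more typical of the target texts than of the reference corpus.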
