Navid Mamoon
# Text Analysis Framework

This framework imports documents from various sources, runs analyses on them, and displays the results as visualizations. It is built around three plugin types, each described in detail below: `DataPlugin`, `AnalysisPlugin`, and `DisplayPlugin`.

## Usage

1. Run `gradle run` in the `plugins` directory to start the program.
2. In the GUI, a list of import buttons appears at the top right.
3. Click a button to import documents. Once imported, you can select one or more documents from the document pane.
4. Finally, to display your documents and their analytics, select a visualization at the bottom right.

## Screenshots

![Importing tweets from Twitter](https://i.imgur.com/qZEGQDI.png)
![Creating a sentiment analysis time series](https://i.imgur.com/c6EuRqS.png)
![Checking analytics of different news media](https://i.imgur.com/L9ghJGa.png)
![Comparing word clouds of presidential candidates](https://i.imgur.com/2E23loR.png)

## Important Classes

### Document

The `Document` class is the primary unit for storing a body of plain text. Beyond the mandatory text field, a document can carry additional information as attributes, such as its `title` or `author`, or analysis results such as its `word_count` or `sentiment_score`.

In the GUI, a Document's name (see `toString`) is a preview of the text *unless* the Document has a `title` attribute, in which case the value of that attribute is used instead.

### DataPlugin

Data plugins add `Document` objects to the framework by accepting user input and parsing the relevant data source into plain text. Where applicable, data plugins can also attach attributes during the import step.

Examples: `LocalFilePlugin`, `TwitterPlugin`.
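The import step and the naming rule described above can be sketched roughly as follows. This is only an illustration: `SketchDocument`, `SketchImporter`, the 20-character preview length, and the "first line is the title" convention are all assumptions made for the example, not the framework's actual classes or behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for the framework's Document; not the real class. */
final class SketchDocument {
    private static final int PREVIEW_LENGTH = 20; // assumed preview size
    private final String text;
    private final Map<String, String> attributes = new LinkedHashMap<>();

    SketchDocument(String text) {
        this.text = text;
    }

    /** Attaches a string attribute and returns this document for chaining. */
    SketchDocument putAttribute(String key, String value) {
        attributes.put(key, value);
        return this;
    }

    String getText() {
        return text;
    }

    /** Uses the "title" attribute when present, otherwise a preview of the text. */
    @Override
    public String toString() {
        String title = attributes.get("title");
        if (title != null) {
            return title;
        }
        return text.length() <= PREVIEW_LENGTH
                ? text
                : text.substring(0, PREVIEW_LENGTH) + "...";
    }
}

/** Hypothetical import step: the first line becomes the title, the rest the text. */
final class SketchImporter {
    static SketchDocument importRaw(String raw) {
        int newline = raw.indexOf('\n');
        if (newline < 0) {
            return new SketchDocument(raw); // no separate title line available
        }
        String title = raw.substring(0, newline).trim();
        String body = raw.substring(newline + 1);
        return new SketchDocument(body).putAttribute("title", title);
    }
}
```

A document imported with a title line would then be listed under that title, while an untitled one would fall back to a text preview.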
### AnalysisPlugin

Analysis plugins are the heart of the framework: they take in any `Document`, analyze its text, and attach the relevant attributes to the document. The framework provides several analytics out of the box:

* Word/Character/Sentence Count
* Word Frequency
* Sentiment Analysis (Score, Magnitude, Description)
* Readability (Automated Readability Index, Flesch-Kincaid Grade Level)

For examples of `AnalysisPlugin`, see the built-in classes that implement it in `framework/analysis` (such as `WordFreqAnalysis` and `SentimentAnalysis`).

### DisplayPlugin

Combined with attributes, display plugins create visualizations for a collection of documents. They can prompt the user for custom input and let the user select from the attributes in the framework that match a certain type of data (more below), which allows type-safe creation of data-driven charts and more.

Examples: `Table`, `WordCloud`.

### DataType

The custom interface `DataType` keeps data flow consistent and makes the framework extensible. Internally, all documents store their attribute values as Strings. `DataType` tracks which type of value each attribute is meant to hold, and it lets the framework check user inputs against a plugin's required parameter types. Each data type has a `canParse(String)` method, which returns true if the provided value can be parsed into the expected type and false otherwise.

The enum `CoreType` provides these predefined types:

* `STRING`
* `BOOLEAN`\*
* `INTEGER`
* `NUMERICAL`
* `DATE`
* `FILE`\*
* `COLOR`\*
* `ANY`

\* *These types have custom integration in our packaged GUI.*

Developers can also implement `DataType` to create and display custom, advanced types, from flattened JSON maps to text manipulation. See `WordFreqType.WORD_FREQ_MAP` for an example of how to create and parse an advanced type.
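To make the `canParse(String)` contract concrete, here is a minimal sketch of how such a check might look. `SketchDataType` and `SketchCoreType` are hypothetical names, and the parsing rules shown (for instance, accepting only `true`/`false` for booleans) are assumptions; the real `DataType` interface and `CoreType` enum may declare more methods and apply different rules.

```java
import java.util.Locale;

/** Hypothetical mirror of the DataType idea; the real interface may differ. */
interface SketchDataType {
    /** Returns true if the value can be parsed into this type, false otherwise. */
    boolean canParse(String value);
}

/** Assumed parsing rules for three types, for illustration only. */
enum SketchCoreType implements SketchDataType {
    INTEGER {
        @Override
        public boolean canParse(String value) {
            try {
                Integer.parseInt(value); // also rejects null with NumberFormatException
                return true;
            } catch (NumberFormatException e) {
                return false;
            }
        }
    },
    BOOLEAN {
        @Override
        public boolean canParse(String value) {
            if (value == null) {
                return false;
            }
            String normalized = value.toLowerCase(Locale.ROOT);
            return normalized.equals("true") || normalized.equals("false");
        }
    },
    ANY {
        @Override
        public boolean canParse(String value) {
            return value != null; // any non-null string is acceptable
        }
    }
}
```

Because every value is stored as a String, a check of this kind is what would let a GUI reject `"4.2"` for an integer parameter before the plugin ever sees it.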
### Parameter

The `Parameter` class is a container that holds a string `displayName` (a user-friendly identifier meant to be used by GUIs) and a data type `type`. It serves two purposes throughout the framework:

1. To facilitate user input through the framework interface's `getParams` method. A GUI can use the parameter's `type` to display different input methods, even though internally every value boils down to a `String`.
2. To tell the framework which attributes an `AnalysisPlugin` provides (described in detail below).

### Attribute

The `Attribute` class is an immutable class that can be attached to Documents to store information about them. This attaching may occur when the Document is first imported or when an analysis is run on it. Each `Attribute` has a `displayName` that is used in the GUI, a `value` that contains the stored information (as a String), and a `type` that describes which `DataType` the value String must satisfy. The `type` also matters when a `DisplayPlugin` requests a certain type (for example, a line chart plugin requesting a `CoreType.NUMERICAL` for the y-axis): the framework will return only compatible attributes.

### DocumentCollection

The `DocumentCollection` class is a container that pairs a `String` display name with a `Collection<Document>`. It acts like a named directory: the user can select multiple documents in the UI at once by selecting the display name. If a `DocumentCollection` contains only one document, the GUI will instead defer to the name (`toString`) of that document. This means it is safe for a plugin to use `null` or an empty value for the name of a singleton `DocumentCollection`.

## Creating your own plugins

### Data Plugin

* Create your key constants and a map that links the keys to parameters, as shown below.
  * The first argument passed to the `Parameter` constructor is the prompt label that the framework will display to the user.
  * Using a `LinkedHashMap` ensures that the framework displays the parameters to the user in the order you define.

```java
private static final String
    PLUGIN_NAME = "Twitter",
    USER_KEY = "username",
    NUM_KEY = "numTweet",
    INCLUDE_RTS = "includeRetweets"; // key value assumed; the original snippet omits this constant

private final Map<String, Parameter> parameters = new LinkedHashMap<>() {{
    put(USER_KEY, new Parameter("Twitter Handle (e.g. @joshbloch)", CoreType.STRING));
    put(NUM_KEY, new Parameter("Number of Tweets", CoreType.INTEGER));
    put(INCLUDE_RTS, new Parameter("Display Retweets?", CoreType.BOOLEAN));
}};
```

* Return the map in the `getParameters()` function.
  * The framework calls `getParameters()` to learn which user inputs the plugin requires and which data type each parameter expects.

```java
@Override
public Map<String, Parameter> getParameters() {
    return parameters;
}
```

* In the `getDocument` function, first make sure that the given `paramMap` has all the keys you expect. Throw an `IllegalArgumentException` if any key is missing.

```java
// Make sure all parameters are provided
for (String key : parameters.keySet()) {
    if (!paramMap.containsKey(key)) {
        throw new IllegalArgumentException("Missing parameter: " + key);
    }
}
```

* Next, still in `getDocument`, parse the `paramMap` argument to extract the user inputs the plugin needs. The framework has already checked that each input matches the specified type, so the values can be parsed directly.

```java
String username = paramMap.get(USER_KEY);
int numTweet = Integer.parseInt(paramMap.get(NUM_KEY));
```

* The `getDocument` function should create a `DocumentCollection` object to return.
  Like so: `return new DocumentCollection(collectionName, docs);`
  * `collectionName` is used by the framework as the title displayed for this collection of documents.
  * `docs` is the collection of documents.
* The `toString` method provides the name of the plugin. This name should be unique among all data plugins.

### Display Plugin

* Create a `<String, Parameter>` map and a `<String, DataType>` map.
  * The `Parameter` map specifies any user inputs (e.g. "Chart Title", "Number of Columns").
  * The `DataType` map specifies which attributes the plugin requires and the types of those attributes (e.g. "y-axis" may require `CoreType.NUMERICAL`).
  * Note: A plugin may require no parameters, no attributes, or neither. In that case, the corresponding maps should be empty rather than `null`.
* Return the `DataType` map in the `getDataTypes` function and the `Parameter` map in `getUserParameters`.
  * The framework calls `getDataTypes` to build a list of compatible attributes by indexing each document's available attributes and the loaded plugins.
  * The framework calls `getUserParameters` to learn which user inputs the DisplayPlugin needs.

```java
private static final String
    TITLE_KEY = "title",
    COLUMN_ONE = "Attribute 1",
    COLUMN_TWO = "Attribute 2 (optional)",
    COLUMN_THREE = "Attribute 3 (optional)";

// The first argument given to Parameter will be displayed.
private final Map<String, Parameter> userParameterMap = Map.of(
    TITLE_KEY, new Parameter("Table Name (optional)", CoreType.STRING)
);

// The keys of this map will be displayed as well.
// Using a LinkedHashMap preserves order in the GUI later.
private final Map<String, DataType> dataTypeMap = new LinkedHashMap<>() {{
    put(COLUMN_ONE, CoreType.ANY);
    put(COLUMN_TWO, CoreType.ANY);
    put(COLUMN_THREE, CoreType.ANY);
}};
```

* In the `visualize` function, the plugin should first parse the expected inputs as shown below.
  * `col1`, `col2`, and `col3` are the data inputs.
    They contain the keys that the plugin can use to obtain a specific Attribute from a Document.
  * You can access the Attribute like so: `Attribute attr = document.getAttribute(col1)`, and then read the value from the attribute.

```java
String col1 = parameters.get(COLUMN_ONE),
       col2 = parameters.get(COLUMN_TWO),
       col3 = parameters.get(COLUMN_THREE);
```

* After obtaining all the values you need, create the visualization in a `JFrame` and return that `JFrame` to the framework. The framework handles displaying the `JFrame` (for example, there is no need to call `setVisible(true)` on the frame).

### Analysis Plugin

To perform an analysis that the framework does not provide, you can implement your own analysis plugin by following these steps:

* Create a `Collection<Parameter>` and return it in the `getParameters()` function.
  * The framework calls `getParameters()` to learn which Attributes (and, importantly, their DataTypes) this plugin will add to the Documents when `analyze` is called.
  * The Parameter's display name should also be the key when calling `putAttribute` on the Documents.
  * The Parameter's type allows the framework to show this Attribute as selectable for appropriate DisplayPlugins.
  * Think about how you will store your information and use a `CoreType` if possible. If not, you can implement your own `DataType` as explained above.

```java
private static final String
    PLUGIN_NAME = "Sentiment Analysis",
    SCORE_NAME = "Sentiment Score",
    MAG_NAME = "Sentiment Magnitude",
    TAG_NAME = "Description of Sentiment";

// Collection of all the attributes this plugin adds
private final Collection<Parameter> attributeList = List.of(
    new Parameter(SCORE_NAME, CoreType.NUMERICAL),
    new Parameter(MAG_NAME, CoreType.NUMERICAL),
    new Parameter(TAG_NAME, CoreType.STRING)
);

@Override
public Collection<Parameter> getParameters() {
    return attributeList; // must match the field declared above
}
```

* In the `analyze` function, the plugin should perform the analysis and attach all the promised Attributes to each Document in the given Collection.
  * Use the function `putAttribute(key, Attribute)` on the Document to add the Attribute to the document.
  * To avoid duplicate work, check whether the Document already has the desired Attributes before computing them, especially if the analysis is expensive. Documents retain their Attributes throughout their lifetime in the framework.
  * Be sure that ALL of the promised Attributes (from `getParameters`) are present on ALL of the Documents when `analyze` completes; otherwise you will get an error.

```java
@Override
public void analyze(Collection<Document> documents) {
    for (Document doc : documents) {
        Sentiment sentiment = null;
        // Only perform the API call if the document does not already have the attributes.
        if (!doc.hasAttributes(Set.of(SCORE_NAME, MAG_NAME, TAG_NAME))) {
            sentiment = getSentiment(doc.getText());
        }
        if (sentiment != null) {
            // Convert the results to Strings, since attribute values are stored as Strings.
            String scoreStr = Float.toString(sentiment.getScore());
            String magStr = Float.toString(sentiment.getMagnitude());
            String tagStr = sentimentTag(sentiment.getScore(), sentiment.getMagnitude());
            Attribute scoreAttr = new Attribute(SCORE_NAME, CoreType.NUMERICAL, scoreStr);
            Attribute magAttr = new Attribute(MAG_NAME, CoreType.NUMERICAL, magStr);
            Attribute tagAttr = new Attribute(TAG_NAME, CoreType.STRING, tagStr);
            // Attach the attributes to the document.
            doc.putAttribute(SCORE_NAME, scoreAttr)
               .putAttribute(MAG_NAME, magAttr)
               .putAttribute(TAG_NAME, tagAttr);
        }
    }
}
```

* That's it! You should now be able to add your Analysis Plugin to the file: `resources\META-INF\services\edu.cmu.cs.cs214.hw5.framework.core.AnalysisPlugin`

## Getting Plugin Keys

Some plugins access third-party APIs that require credentials. Those plugins are listed here.

### SentimentAnalysis (Google NLP)

* Go to the [Google Cloud Documentation](https://cloud.google.com/docs/authentication/production#manually)
* Follow the instructions under "Passing credentials manually"
  1. Create a service account
  2. Obtain the service account credentials file (`.json`)
  3. Set your `GOOGLE_APPLICATION_CREDENTIALS` environment variable to the file path of your credentials file

### TwitterPlugin (TwitterAPI)

* Go to the [Twitter Developer Portal](https://developer.twitter.com/en/apply-for-access) and apply for a developer account
* Enter the portal and find your credentials
* Create a `twitterConfig.json` file like the one below, replacing each placeholder with your own value:

```json
{
  "consumerKey": "<your API Key>",
  "consumerSecret": "<your API Secret>",
  "accessToken": "<your Access Token>",
  "accessTokenSecret": "<your Access Secret>"
}
```

* Lastly, place the `twitterConfig.json` you created in the relative path `src/main/resources/config/`. In other words, the TwitterPlugin expects to read your API keys from `src/main/resources/config/twitterConfig.json`.
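Since both plugins depend on files being in the right place, a small pre-flight check can catch a missing credential before the GUI starts. The sketch below is not part of the framework: `CredentialCheck` and `findProblems` are hypothetical names, and it only verifies that the files exist, not that their contents are valid.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical pre-flight check; not part of the framework. */
final class CredentialCheck {
    static final String GOOGLE_ENV = "GOOGLE_APPLICATION_CREDENTIALS";
    static final Path TWITTER_CONFIG =
            Path.of("src/main/resources/config/twitterConfig.json");

    /** Returns human-readable problems; an empty list means both files were found. */
    static List<String> findProblems(String googleCredentialsPath, Path twitterConfig) {
        List<String> problems = new ArrayList<>();
        if (googleCredentialsPath == null || googleCredentialsPath.isBlank()) {
            problems.add(GOOGLE_ENV + " is not set");
        } else if (!Files.isRegularFile(Path.of(googleCredentialsPath))) {
            problems.add(GOOGLE_ENV + " does not point to a file: " + googleCredentialsPath);
        }
        if (!Files.isRegularFile(twitterConfig)) {
            problems.add("Missing Twitter config: " + twitterConfig);
        }
        return problems;
    }
}
```

It could be called as `CredentialCheck.findProblems(System.getenv(CredentialCheck.GOOGLE_ENV), CredentialCheck.TWITTER_CONFIG)` from the project root, printing each returned message.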
