What do -2.5, 0.92195 and [-1, -3, -3, -3, -4, -3, -1, -2, -2, -3] represent?

How should I code it for a new word? Say I have to add something like '100%' or 'A1'.

  • I can also see positive and negative word txt files in the nltk_data\corpora\opinion_lexicon folder. How are these used? Can I add my words to these txt files too?
  • I believe that VADER only uses the word and the first value when classifying text. If you want to add new words, you can simply create a dictionary of words and their sentiment values, which can be added using the update function:

    from nltk.sentiment.vader import SentimentIntensityAnalyzer
    Analyzer = SentimentIntensityAnalyzer()
    # your_dictionary maps each new word to its sentiment value
    Analyzer.lexicon.update(your_dictionary)
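
As for what the numbers in the question mean: each line of the VADER lexicon file holds four tab-separated fields, namely the token, the mean sentiment rating, the standard deviation of the ratings, and the raw list of ten human ratings on a -4 to +4 scale. The sketch below parses one such line and checks that the second and third fields follow from the list. The token `someword` is a hypothetical placeholder, since the question does not say which word the line belongs to, and `parse_lexicon_line` is an illustrative helper, not part of NLTK.

```python
from statistics import mean, pstdev

def parse_lexicon_line(line):
    """Split one lexicon line into (token, mean, std_dev, raw_ratings)."""
    token, mean_rating, std_dev, raw = line.rstrip("\n").split("\t")
    ratings = [int(r) for r in raw.strip("[]").split(",")]
    return token, float(mean_rating), float(std_dev), ratings

# 'someword' is a placeholder token; the numbers are the ones from the question.
line = "someword\t-2.5\t0.92195\t[-1, -3, -3, -3, -4, -3, -1, -2, -2, -3]"
token, mean_rating, std_dev, ratings = parse_lexicon_line(line)

print(mean(ratings))              # -2.5, matches the second field
print(round(pstdev(ratings), 5))  # 0.92195, matches the third field
```

Only the token and the mean rating end up in `Analyzer.lexicon`, which is why a plain `{word: value}` dictionary is enough for an update.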
    

    You can manually assign each word a sentiment value based on its perceived intensity or, if that is impractical, assign one broad value to each of the two categories (e.g. -1.5 for negative words and 1.5 for positive words).

    You can use this script (not mine) to check whether your updates have been picked up:
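
A minimal sketch of preparing such a dictionary for the words from the question. The helper `build_lexicon_entries` and the scores 2.0 and 1.5 are hypothetical choices for illustration. It assumes two things: that values should stay within the -4 to +4 range used by the lexicon's ratings, and that keys should be lowercase, because as far as I can tell VADER lowercases each token before looking it up.

```python
def build_lexicon_entries(raw_scores, low=-4.0, high=4.0):
    """Return a {lowercased token: float score} dict ready for lexicon.update()."""
    entries = {}
    for token, score in raw_scores.items():
        score = float(score)
        if not low <= score <= high:
            raise ValueError(f"score for {token!r} is outside [{low}, {high}]")
        # Assumption: lookups happen on the lowercased token, so store it that way.
        entries[token.lower()] = score
    return entries

# Hypothetical scores for the two example words from the question.
custom_words = build_lexicon_entries({"A1": 2.0, "100%": 1.5})
print(custom_words)  # {'a1': 2.0, '100%': 1.5}

# With an analyzer at hand, the entries would then be added like this:
# Analyzer.lexicon.update(custom_words)
```

Whether a token carrying punctuation, such as '100%', survives VADER's own tokenization unchanged is worth verifying on a test sentence before relying on it.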

    from nltk.tokenize import word_tokenize  # requires NLTK's tokenizer data
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    Analyzer = SentimentIntensityAnalyzer()
    sentence = 'enter your text to test'
    tokenized_sentence = word_tokenize(sentence)

    pos_word_list = []
    neu_word_list = []
    neg_word_list = []

    # Score each token on its own and bucket it by its compound score.
    for word in tokenized_sentence:
        compound = Analyzer.polarity_scores(word)['compound']
        if compound >= 0.1:
            pos_word_list.append(word)
        elif compound <= -0.1:
            neg_word_list.append(word)
        else:
            neu_word_list.append(word)

    print('Positive:', pos_word_list)
    print('Neutral:', neu_word_list)
    print('Negative:', neg_word_list)

    # Score the sentence as a whole.
    score = Analyzer.polarity_scores(sentence)
    print('\nScores:', score)
    

    Before updating VADER:

    sentence = 'stocks were volatile on Tuesday due to the recent calamities in the Chinese market'
    Positive: []
    Neutral: ['stocks', 'were', 'volatile', 'on', 'Tuesday', 'due', 'to', 'the', 'recent', 'calamities', 'in', 'the', 'Chinese', 'markets']
    Negative: []
    Scores: {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0}
    

    After updating VADER with a finance-based lexicon:

    Analyzer.lexicon.update(Financial_Lexicon)
    sentence = 'stocks were volatile on Tuesday due to the recent calamities in the Chinese market'
    Positive: []
    Neutral: ['stocks', 'were', 'on', 'Tuesday', 'due', 'to', 'the', 'recent', 'in', 'the', 'Chinese', 'markets']
    Negative: ['volatile', 'calamities']
    Scores: {'neg': 0.294, 'neu': 0.706, 'pos': 0.0, 'compound': -0.6124}
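
The "after" scores can be reproduced by hand, which helps when judging what value to give a new word. The sketch below re-implements, in simplified form, how NLTK's VADER turns per-token valences into the final dictionary. It assumes that 'volatile' and 'calamities' were each given -1.5 in the finance lexicon (an inference from the output, not something stated above) and that no boosters, negations or punctuation emphasis apply to this sentence. The function names are illustrative.

```python
import math

def normalize(score, alpha=15):
    """Squash the summed valence into the range -1..1 (the compound score)."""
    return score / math.sqrt(score * score + alpha)

def proportions(valences):
    """Share of negative, neutral and positive weight across all tokens."""
    # Each scored token counts as its valence plus one extra unit of weight.
    pos_sum = sum(v + 1 for v in valences if v > 0)
    neg_sum = sum(v - 1 for v in valences if v < 0)
    neu_count = sum(1 for v in valences if v == 0)
    total = pos_sum + abs(neg_sum) + neu_count
    return {
        'neg': round(abs(neg_sum) / total, 3),
        'neu': round(neu_count / total, 3),
        'pos': round(pos_sum / total, 3),
    }

# 14 tokens in the sentence: 12 unscored, 2 assumed to be -1.5 each.
valences = [0.0] * 12 + [-1.5, -1.5]
scores = proportions(valences)
scores['compound'] = round(normalize(sum(valences)), 4)
print(scores)  # {'neg': 0.294, 'neu': 0.706, 'pos': 0.0, 'compound': -0.6124}
```

Because of the normalization step, doubling a word's value does not double the compound score, so broad values such as -1.5 and 1.5 already move the result noticeably.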
    Thanks @laurie. Can you also explain the following? If the words in my input are not present in the lexicon file, there should be no score. However, I am getting positive scores for inputs where none of the words are present in the lexicon txt.
    – Mighty, Jul 31, 2018 at 5:45

    That's odd. Can you provide an example? Have you used the testing script to examine which words are being picked out?
    – Laurie, Jul 31, 2018 at 8:05
            
