lst1=['spot','mistake']
lst1_tweets=tweets[tweets['tweet_text'].str.contains('|'.join(lst1))].reset_index()

I want to double-check the result on a single tweet, so I have:

f=lst1_tweets['tweet_text'][0]
f='Spot the spelling mistake Welsh and Walsh. You are showing picture of presenter Bradley Walsh who is alive and kick'
type(f)
<class 'str'>

I tried

f.str.contains('|'.join(lst1))

which raises:

AttributeError: 'str' object has no attribute 'str'

I also tried

f.contains('|'.join(lst1))

which raises:

AttributeError: 'str' object has no attribute 'contains'

Any suggestions on how I can search for a list of words in a string?

You can only use .str.contains() on a Pandas series, not after extracting an individual string.
– Barmar
Nov 8, 2019 at 21:19

Here your f is referencing a Python string, whose class is named str: type(f) is str. pandas.Series.str is a different class with different attributes, including contains. You can check whether an object has an attribute with a certain name (without raising an Exception, that is) with the built-in callable hasattr.
– BatWannaBe
Nov 8, 2019 at 21:35

This could be a problem because the list of strings he's searching for contains 'spot' and 'mistake', but the string he's searching in contains 'Spot' and 'mistake'. Upper-case and lower-case characters are encoded differently, so the in operator for Python strings is case-sensitive, and unlike pandas.Series.str.contains, you can't make the search case-insensitive. I don't know this very well, but the | appears to be a regex character. pandas.Series.str.contains might be using the same syntax as the Python module re does to search strings.
– BatWannaBe
Nov 8, 2019 at 21:48

Looking up the pandas documentation, pandas.Series.str.contains does in fact use the re module. .lower() works too, but the re module could be more familiar.
– BatWannaBe
Nov 8, 2019 at 21:58
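As a minimal sketch of the regex route these comments point to, the same '|'.join(lst1) pattern can be handed to the standard-library re module to search a single plain string. The re.escape call and the re.IGNORECASE flag are additions that are not in the original post; escaping only matters if the words could contain regex metacharacters.

```python
import re

lst1 = ['spot', 'mistake']
f = 'Spot the spelling mistake Welsh and Walsh. You are showing picture of presenter Bradley Walsh who is alive and kick'

# Build the alternation pattern 'spot|mistake'. re.escape is an added
# safeguard (not in the original post) against regex metacharacters.
pattern = '|'.join(re.escape(word) for word in lst1)

# re.search returns a match object if any alternative occurs, else None.
found = re.search(pattern, f, flags=re.IGNORECASE) is not None

# Every case-insensitive hit, in order of appearance.
hits = re.findall(pattern, f, flags=re.IGNORECASE)

print(found, hits)
```

Without re.IGNORECASE, only 'mistake' would match here, because the tweet spells the other word as 'Spot'.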

You might be confusing this with .str.contains() from pandas, which exists but applies to a Series, not to a plain Python string. For a plain string, use the in or not in operators instead. For a full guide, see the question "Does Python have a string 'contains' substring method?"
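A short sketch of the in / not in approach, using the string from the question. Lower-casing the text first is an assumption added here to get a case-insensitive test; it relies on the words in lst1 already being lower-case.

```python
lst1 = ['spot', 'mistake']
f = 'Spot the spelling mistake Welsh and Walsh. You are showing picture of presenter Bradley Walsh who is alive and kick'

# `in` performs a case-sensitive substring test on a plain str.
has_mistake = 'mistake' in f
missing_spot = 'spot' not in f  # the tweet contains 'Spot', not 'spot'

# Lower-case the text once, then test each word against it
# (assumes the words in lst1 are already lower-case).
f_lower = f.lower()
any_word = any(word in f_lower for word in lst1)

print(has_mistake, missing_spot, any_word)
```

Note that in matches substrings, so 'spot' would also be found inside a longer word such as 'spotted'.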

From pandas docs:

Series.str.contains(self, pat, case=True, flags=0, na=nan, regex=True). Test if pattern or regex is contained within a string of a Series or Index.
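To illustrate the signature quoted above, here is a hedged sketch that applies the question's filter with case=False and na=False. The small tweets DataFrame is made-up sample data, and reset_index(drop=True) is a slight variation on the original reset_index() call that discards the old index instead of keeping it as a column.

```python
import pandas as pd

lst1 = ['spot', 'mistake']

# Hypothetical sample data standing in for the asker's `tweets` DataFrame.
tweets = pd.DataFrame({'tweet_text': [
    'Spot the spelling mistake Welsh and Walsh.',
    'Nothing to see here.',
    None,
]})

# case=False makes the regex match case-insensitive;
# na=False treats missing tweets as non-matches instead of NaN.
mask = tweets['tweet_text'].str.contains('|'.join(lst1), case=False, na=False)

lst1_tweets = tweets[mask].reset_index(drop=True)
print(lst1_tweets)
```

Because the .str accessor belongs to the Series, the check has to be done on tweets['tweet_text'] itself rather than on a single element pulled out with [0].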

Not too sure if you're just checking for certain strings in a string, but I'm pretty sure .contains isn't a Python thing. Try this:

for "string" in f:
    # do whatever
Also, a for-loop assigns an object from an iterable to a variable on each iteration. You can't assign objects to a string. This is likely intended to be an if-statement.
– BatWannaBe
Nov 8, 2019 at 21:52
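Following the comment above, one possible reading of the intended snippet is an if-statement using the in operator. This is only a sketch of what the answer may have meant; the matched list and the lower-casing step are illustrative additions.

```python
lst1 = ['spot', 'mistake']
f = 'Spot the spelling mistake Welsh and Walsh. You are showing picture of presenter Bradley Walsh who is alive and kick'

# An if-statement tests membership; a for-loop would instead try to
# assign each character of f to a loop variable.
if 'mistake' in f:
    print("found 'mistake'")

# Collect which words from the list occur in the tweet, ignoring case
# (illustrative addition; assumes the words in lst1 are lower-case).
matched = [word for word in lst1 if word in f.lower()]
print(matched)
```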
        
