"expected string or buffer" error while splitting in Python

I am trying to split a document into paragraphs first, and then each paragraph into lines. I then check the lines and print the paragraph.

Although the code below achieves that, an 'expected string or buffer' error shows up when I try to do the same for multiple documents.

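import io
import re

def custom_section(input_path, write_path):
    with io.open(input_path, mode='r') as f, io.open(write_path, mode='w') as f2:
        data = f.read()
        # Split the document into paragraphs on blank lines
        splat = re.split(r"\n(\s)*\n", data)
        mylist = []
        for para1 in splat:
            # Split each paragraph into lines (the TypeError is raised here)
            splat2 = re.split(r"\n", para1)
            for line1 in splat2:
                pass  # PERFORM SOME OPERATION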

Error

<ipython-input-218-18e633df1d46> in custom_section(input_path, write_path)
     14         mylist=[]
     15         for para1 in splat:
---> 16             splat2= re.split(r"\n", para1)
     17             for line1 in splat2:
     18 #                 line1 = line1.decode("utf-8")

/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.pyc in split(pattern, string, maxsplit, flags)
    169     """Split the source string by the occurrences of the pattern,
    170     returning a list containing the resulting substrings."""
--> 171     return _compile(pattern, flags).split(string, maxsplit)
    172 
    173 def findall(pattern, string, flags=0):

TypeError: expected string or buffer

1 answer

  • answered 2018-04-17 05:43 Sjshovan

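    I believe this error occurs because the list returned by re.split() and stored in your variable splat contains one or more None objects. The pattern has a capturing group, (\s), so re.split() also returns what the group matched for each separator, and that is None when the group matched nothing (for example, when two newlines are directly adjacent). Passing None to the inner re.split() then raises the TypeError. If you insist on using re.split() with that pattern, you could remove the None objects with the filter() function, like so: filter(None, splat). Note that this also drops empty strings and leaves any whitespace the group did capture in the list; writing the pattern without the group, r"\n\s*\n", avoids both.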
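
A minimal sketch of the behaviour described above, using a made-up sample string (the variable names and the text in data are assumptions for illustration, not taken from the question). It compares the original pattern with a version that has no capturing group, and should run on both Python 2.7 and Python 3:

```python
import re

# Hypothetical sample text: the first two paragraphs are separated by an
# empty line, the last two by a line that contains a single space.
data = "a1\na2\n\nb1\n \nc1"

# Original pattern: the capturing group makes re.split() return an extra
# item per separator, which is None when the group matched nothing.
with_group = re.split(r"\n(\s)*\n", data)

# Dropping only the None items still leaves the captured whitespace behind.
without_none = [part for part in with_group if part is not None]

# Same separator without a capturing group: only the paragraphs come back.
paragraphs = re.split(r"\n\s*\n", data)
# paragraphs == ['a1\na2', 'b1', 'c1']

# Each paragraph is a plain string, so splitting it into lines is safe.
lines = [para.split("\n") for para in paragraphs]
```

With the group removed, every item in paragraphs is a string, so the inner split no longer sees None. A non-capturing group, (?:\s)*, would behave the same way if grouping is still wanted.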