I have lists of NLTK tags below. I want to select the tokens tagged 'nnp' and, more specifically, first and last names (e.g. event chair Iris Dankner, producer Barbara Schorr).
o5 = [[(u'room', 'nn'), (u'designers', 'nns'), (u',', ','), (u'bcrf', 'nnp'), (u'and', 'cc'), (u'holiday', 'nnp'), (u'house', 'nnp'), (u'staff', 'nn'), (u'cheer', 'nn'), (u'themselves', 'prp'), (u'for', 'in'), (u'a', 'dt'), (u'job', 'nn'), (u'well', 'rb'), (u'done', 'vbn')], [(u'holiday', 'nnp'), (u'house', 'nnp'), (u'founder', 'nnp'), (u'and', 'cc'), (u'event', 'nnp'), (u'chair', 'nnp'), (u'iris', 'nnp'), (u'dankner', 'nnp'), (u'with', 'in'), (u'event', 'nnp'), (u'producer', 'nnp'), (u'barbara', 'nnp'), (u'schorr', 'nnp')], [(u'architect', 'nnp'), (u'joan', 'nnp'), (u'dineen', 'nnp'), (u'with', 'in'), (u'alyson', 'nnp'), (u'liss', 'nnp')]]
Here is what I tried:
o5 = [o5[i][0] o5[i][1] == "nnp"]
and
o5 = [o5[i][0] o5[i][1] = "nnp"]
Both produce SyntaxError: invalid syntax. Can anyone give me suggestions here? Thank you!
First of all, I would not recommend naming your list o5.
input = [[(u'room', 'nn'), (u'designers', 'nns'), (u',', ','), (u'bcrf', 'nnp'), (u'and', 'cc'), (u'holiday', 'nnp'), (u'house', 'nnp'), (u'staff', 'nn'), (u'cheer', 'nn'), (u'themselves', 'prp'), (u'for', 'in'), (u'a', 'dt'), (u'job', 'nn'), (u'well', 'rb'), (u'done', 'vbn')], [(u'holiday', 'nnp'), (u'house', 'nnp'), (u'founder', 'nnp'), (u'and', 'cc'), (u'event', 'nnp'), (u'chair', 'nnp'), (u'iris', 'nnp'), (u'dankner', 'nnp'), (u'with', 'in'), (u'event', 'nnp'), (u'producer', 'nnp'), (u'barbara', 'nnp'), (u'schorr', 'nnp')], [(u'architect', 'nnp'), (u'joan', 'nnp'), (u'dineen', 'nnp'), (u'with', 'in'), (u'alyson', 'nnp'), (u'liss', 'nnp')]]
Then you can use a list comprehension:
result = [token[0] for sent in input for token in sent if token[1] == 'nnp']
To make it more readable, you can use an explicit loop:
result = []
for sent in input:
    result = result + [token[0] for token in sent if token[1] == 'nnp']
If you want to extract chunks (sequences of tokens), I recommend using regular expressions.
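As a rough sketch of chunk extraction, assuming a chunk simply means a run of consecutive tokens tagged 'nnp': the version below uses itertools.groupby from the standard library instead of regular expressions. The helper name nnp_chunks is made up for illustration.

```python
from itertools import groupby

def nnp_chunks(sent, tag="nnp"):
    """Return runs of consecutive tokens carrying `tag`, each joined into one string.

    `nnp_chunks` is a hypothetical helper, not part of NLTK.
    """
    chunks = []
    # groupby splits the sentence into alternating runs of matching / non-matching tokens
    for is_match, group in groupby(sent, key=lambda token: token[1] == tag):
        if is_match:
            chunks.append(" ".join(word for word, _ in group))
    return chunks

sample = [(u'architect', 'nnp'), (u'joan', 'nnp'), (u'dineen', 'nnp'),
          (u'with', 'in'), (u'alyson', 'nnp'), (u'liss', 'nnp')]
print(nnp_chunks(sample))  # ['architect joan dineen', 'alyson liss']
```

Note that a run still includes titles such as 'architect' when the tagger marks them 'nnp', so separating the actual first and last names would need an extra step.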