nlp - Selecting and concatenating two (or more) strings based on NLTK tags


I have lists of NLTK-tagged tokens, shown below. I want to select the tokens tagged 'nnp' and, more specifically, the first and last names (e.g. event chair iris dankner, producer barbara schorr).

o5 = [[(u'room', 'nn'), (u'designers', 'nns'), (u',', ','), (u'bcrf', 'nnp'), (u'and', 'cc'), (u'holiday', 'nnp'), (u'house', 'nnp'), (u'staff', 'nn'), (u'cheer', 'nn'), (u'themselves', 'prp'), (u'for', 'in'), (u'a', 'dt'), (u'job', 'nn'), (u'well', 'rb'), (u'done', 'vbn')], [(u'holiday', 'nnp'), (u'house', 'nnp'), (u'founder', 'nnp'), (u'and', 'cc'), (u'event', 'nnp'), (u'chair', 'nnp'), (u'iris', 'nnp'), (u'dankner', 'nnp'), (u'with', 'in'), (u'event', 'nnp'), (u'producer', 'nnp'), (u'barbara', 'nnp'), (u'schorr', 'nnp')], [(u'architect', 'nnp'), (u'joan', 'nnp'), (u'dineen', 'nnp'), (u'with', 'in'), (u'alyson', 'nnp'), (u'liss', 'nnp')]] 

Here is what I tried:

o5 = [o5[i][0] o5[i][1] ==  "nnp"] 

and

o5 = [o5[i][0] o5[i][1] =  "nnp"] 

Both produce SyntaxError: invalid syntax. Could anyone give me suggestions here? Thank you!

First of all, I would not recommend naming your list o5.

input = [[(u'room', 'nn'), (u'designers', 'nns'), (u',', ','), (u'bcrf', 'nnp'), (u'and', 'cc'), (u'holiday', 'nnp'), (u'house', 'nnp'), (u'staff', 'nn'), (u'cheer', 'nn'), (u'themselves', 'prp'), (u'for', 'in'), (u'a', 'dt'), (u'job', 'nn'), (u'well', 'rb'), (u'done', 'vbn')], [(u'holiday', 'nnp'), (u'house', 'nnp'), (u'founder', 'nnp'), (u'and', 'cc'), (u'event', 'nnp'), (u'chair', 'nnp'), (u'iris', 'nnp'), (u'dankner', 'nnp'), (u'with', 'in'), (u'event', 'nnp'), (u'producer', 'nnp'), (u'barbara', 'nnp'), (u'schorr', 'nnp')], [(u'architect', 'nnp'), (u'joan', 'nnp'), (u'dineen', 'nnp'), (u'with', 'in'), (u'alyson', 'nnp'), (u'liss', 'nnp')]] 

Then you can use a list comprehension:

result = [token[0] for sent in input for token in sent if token[1] == 'nnp']

To make it more understandable, you can use an explicit loop:

result = []
for sent in input:
    # collect the words tagged 'nnp' in this sentence
    result = result + [token[0] for token in sent if token[1] == 'nnp']

If you want to extract chunks (sequences of tokens), I recommend using regular expressions.
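As a rough illustration of what chunk extraction could look like, here is a minimal sketch that joins each run of consecutive 'nnp' tokens into one string. Note that it uses itertools.groupby from the standard library as a simpler stand-in for a regular-expression chunker; the helper name nnp_chunks and the short sample sentence are only assumptions made for this example.

```python
from itertools import groupby

def nnp_chunks(sentence, tag="nnp"):
    """Join each run of consecutive tokens carrying `tag` into one string.

    `sentence` is a list of (word, tag) tuples, as in the question.
    """
    chunks = []
    # groupby splits the sentence into runs where the key (tag matches or not) is constant
    for is_match, group in groupby(sentence, key=lambda tok: tok[1] == tag):
        if is_match:
            chunks.append(" ".join(word for word, _ in group))
    return chunks

# Hypothetical sample: the third sentence from the question
sent = [("architect", "nnp"), ("joan", "nnp"), ("dineen", "nnp"),
        ("with", "in"), ("alyson", "nnp"), ("liss", "nnp")]
print(nnp_chunks(sent))  # ['architect joan dineen', 'alyson liss']
```

This only groups adjacent 'nnp' tokens, so titles such as "architect" stay attached to the name; separating actual first and last names from titles would need extra rules or a name list.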

