weblyzard_api Package¶

The webLyzard API package.

Provides support for webLyzard web services. Please refer to the client module for a list of available web services.

`xml_content` Module¶

Created on Feb 27, 2013

Handles the new webLyzard XML format (namespace http://www.weblyzard.com/wl/2013#).

Functions added:

support for sentence tokens and pos iterators

Functions removed:

compatibility fixes for namespaces, encodings etc.
support for the old POS tags mapping.

class weblyzard_api.xml_content.LabeledDependency(parent, pos, label)¶

Bases: tuple

label¶: Alias for field number 2

parent¶: Alias for field number 0

pos¶: Alias for field number 1
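
The "Alias for field number" entries are how a named tuple's fields are usually rendered. As a minimal sketch, the stand-in below recreates the documented field order (parent, pos, label) with `collections.namedtuple`. It is an illustrative re-creation, not the package's own definition.

```python
from collections import namedtuple

# Stand-in mirroring the documented field order; not imported from weblyzard_api.
LabeledDependency = namedtuple("LabeledDependency", ["parent", "pos", "label"])

dep = LabeledDependency(parent="-1", pos="PRP", label="ROOT")

# Each field is reachable by name or by its position (the "field number").
print(dep.parent, dep[0])
print(dep.pos, dep[1])
print(dep.label, dep[2])

# Being a tuple, it also unpacks like one.
parent, pos, label = dep
```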

class weblyzard_api.xml_content.Sentence(md5sum=None, pos=None, sem_orient=None, significance=None, token=None, value=None, is_title=False, dependency=None)[source]¶

Bases: object

The sentence class used for accessing single sentences.

Note

The class provides convenience properties for accessing POS tags and tokens:

s.sentence: the sentence text
s.tokens: iterates over the sentence's tokens (e.g. 'A', 'new', 'day')
s.pos_tags: provides a list of POS tags (e.g. ['DET', 'JJ', 'NN'])

as_dict()[source]¶

Returns:	a dictionary representation of the sentence object.

dependency_list¶

Returns:	the dependencies of the sentence as a list of LabeledDependency objects
Return type:	`list` of weblyzard_api.xml_content.LabeledDependency objects

>>> s = Sentence(pos='RB PRP MD', dependency='1:SUB -1:ROOT 1:OBJ')
>>> s.dependency_list
[LabeledDependency(parent='1', pos='RB', label='SUB'), LabeledDependency(parent='-1', pos='PRP', label='ROOT'), LabeledDependency(parent='1', pos='MD', label='OBJ')]
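
The example suggests that each `parent:label` item in the dependency string is paired, by position, with the POS tag at the same index. The sketch below reproduces that pairing under this assumption. `parse_dependencies` and the local `LabeledDependency` are hypothetical stand-ins, not part of weblyzard_api.

```python
from collections import namedtuple

# Stand-in with the documented field order; not the library's own class.
LabeledDependency = namedtuple("LabeledDependency", ["parent", "pos", "label"])


def parse_dependencies(pos, dependency):
    """Hypothetical helper: pair each 'parent:label' item with its POS tag."""
    result = []
    for tag, item in zip(pos.split(), dependency.split()):
        # Split "1:SUB" into parent "1" and label "SUB".
        parent, _, label = item.partition(":")
        result.append(LabeledDependency(parent, tag, label or None))
    return result


deps = parse_dependencies("RB PRP MD", "1:SUB -1:ROOT 1:OBJ")
for dep in deps:
    print(dep)
```

As in the documented output, `parent` stays a string ('1', '-1'); convert it with `int()` if you need a numeric index.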

get_dependency_list()[source]¶

Returns:	the dependencies of the sentence as a list of LabeledDependency objects
Return type:	`list` of weblyzard_api.xml_content.LabeledDependency objects

>>> s = Sentence(pos='RB PRP MD', dependency='1:SUB -1:ROOT 1:OBJ')
>>> s.get_dependency_list()
[LabeledDependency(parent='1', pos='RB', label='SUB'), LabeledDependency(parent='-1', pos='PRP', label='ROOT'), LabeledDependency(parent='1', pos='MD', label='OBJ')]

get_pos_tags()[source]¶

Get the POS Tags as list.

>>> sentence = Sentence(pos = 'PRP ADV NN')
>>> sentence.get_pos_tags()
['PRP', 'ADV', 'NN']

get_pos_tags_list()[source]¶

Returns:	list of the sentence’s POS tags

>>> sentence = Sentence(pos = 'PRP ADV NN')
>>> sentence.get_pos_tags_list()
['PRP', 'ADV', 'NN']

get_pos_tags_string()[source]¶

Returns:	the sentence’s POS tags as a single space-separated string

>>> sentence = Sentence(pos = 'PRP ADV NN')
>>> sentence.get_pos_tags_string()
'PRP ADV NN'

get_sentence()[source]¶

get_tokens()[source]¶

Returns:	an iterator providing the sentence’s tokens

pos_tag_string¶

Returns:	the sentence’s POS tags as a single space-separated string

>>> sentence = Sentence(pos = 'PRP ADV NN')
>>> sentence.pos_tag_string
'PRP ADV NN'

pos_tags¶

Get the POS Tags as list.

>>> sentence = Sentence(pos = 'PRP ADV NN')
>>> sentence.pos_tags
['PRP', 'ADV', 'NN']

pos_tags_list¶

Returns:	list of the sentence’s POS tags

>>> sentence = Sentence(pos = 'PRP ADV NN')
>>> sentence.pos_tags_list
['PRP', 'ADV', 'NN']

sentence¶

set_dependency_list(dependencies)[source]¶

Sets the sentence's dependencies from a list of weblyzard_api.xml_content.LabeledDependency objects.

Parameters:	dependencies (list) – The dependencies to set for this sentence.

Note

The list must contain items of the type weblyzard_api.xml_content.LabeledDependency

>>> s = Sentence(pos='RB PRP MD', dependency='1:SUB -1:ROOT 1:OBJ')
>>> s.dependency_list
[LabeledDependency(parent='1', pos='RB', label='SUB'), LabeledDependency(parent='-1', pos='PRP', label='ROOT'), LabeledDependency(parent='1', pos='MD', label='OBJ')]
>>> s.dependency_list = [LabeledDependency(parent='-1', pos='MD', label='ROOT'), ]
>>> s.dependency_list
[LabeledDependency(parent='-1', pos='MD', label='ROOT')]
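
The assignment to `s.dependency_list` above, together with the same-named `get_*`/`set_*` methods, suggests that each attribute is a property wrapping a getter/setter pair. The sketch below shows that pattern for the POS tag list only. `SentenceSketch` is a hypothetical stand-in and does not reproduce the library's implementation.

```python
class SentenceSketch(object):
    """Hypothetical stand-in; only the POS tag handling is modelled."""

    def __init__(self, pos=None):
        # Space-separated POS tags, as in the documented examples.
        self.pos = pos

    def get_pos_tags_list(self):
        return (self.pos or "").split()

    def set_pos_tags_list(self, pos_tags_list):
        self.pos = " ".join(pos_tags_list)

    # Expose the getter/setter pair under a single attribute name.
    pos_tags_list = property(get_pos_tags_list, set_pos_tags_list)


s = SentenceSketch(pos="PRP ADV NN")
print(s.pos_tags_list)           # ['PRP', 'ADV', 'NN']

s.pos_tags_list = ["DET", "NN"]  # routed through set_pos_tags_list
print(s.pos)                     # DET NN
```

With this pattern, reading the attribute and calling the getter return the same value, so callers can use whichever form reads better.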

set_pos_tags(new_pos_tags)[source]¶

set_pos_tags_list(pos_tags_list)[source]¶

set_pos_tags_string(new_value)[source]¶

set_sentence(new_sentence)[source]¶

tokens¶

Returns:	an iterator providing the sentence’s tokens

class weblyzard_api.xml_content.XMLContent(xml_content, remove_duplicates=True)[source]¶

Bases: object

SUPPORTED_XML_VERSIONS = {'deprecated': <class 'weblyzard_api.xml_content.parsers.xml_deprecated.XMLDeprecated'>, 2005: <class 'weblyzard_api.xml_content.parsers.xml_2005.XML2005'>, 2013: <class 'weblyzard_api.xml_content.parsers.xml_2013.XML2013'>}¶
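
`SUPPORTED_XML_VERSIONS` is a registry that maps a version key to the parser class for that version. A minimal sketch of such a lookup follows. The `*Stub` classes and `parser_for` are hypothetical, and how the library itself consults the registry is not shown in this reference.

```python
class XMLDeprecatedStub(object):
    VERSION = "deprecated"


class XML2005Stub(object):
    VERSION = 2005


class XML2013Stub(object):
    VERSION = 2013


# Registry keyed by each parser's VERSION, shaped like SUPPORTED_XML_VERSIONS.
SUPPORTED = {cls.VERSION: cls
             for cls in (XMLDeprecatedStub, XML2005Stub, XML2013Stub)}


def parser_for(version):
    """Hypothetical lookup: return the parser class for a version key."""
    try:
        return SUPPORTED[version]
    except KeyError:
        raise ValueError("unsupported XML version: %r" % (version,))


print(parser_for(2013).__name__)  # XML2013Stub
```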

add_attribute(key, value)[source]¶

classmethod apply_dict_mapping(attributes, mapping=None)[source]¶

as_dict(mapping=None, ignore_non_sentence=False, add_titles_to_sentences=False)[source]¶

Converts the XML content to a dictionary.

Parameters:
mapping – an optional mapping by which to restrict/rename the returned dictionary
ignore_non_sentence – if true, sentences without POS tags are omitted from the result
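
One plausible reading of "restrict/rename" is that only the keys named in the mapping are kept, each renamed to the mapped value. The sketch below illustrates that reading. `apply_mapping` is a hypothetical helper, and the exact semantics of `mapping` in the library are an assumption here.

```python
def apply_mapping(attributes, mapping=None):
    """Hypothetical: keep only keys named in `mapping`, renamed to its values."""
    if not mapping:
        # Without a mapping, return an unmodified copy.
        return dict(attributes)
    return {mapping[key]: value
            for key, value in attributes.items() if key in mapping}


doc = {"content_id": 12, "lang": "en", "nilsimsa": None}
print(apply_mapping(doc, {"content_id": "id", "lang": "language"}))
# {'id': 12, 'language': 'en'}
```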

content_id¶

content_type¶

classmethod convert(xml_content, target_version)[source]¶

get_content_id()[source]¶

get_content_type()[source]¶

get_lang()[source]¶

get_nilsimsa()[source]¶

get_plain_text()[source]¶

Returns:	the plain text of the XML content

get_sentences()[source]¶

classmethod get_text(text)[source]¶

Returns:	the UTF-8 encoded text

get_title()[source]¶

get_xml_document(header_fields='all', sentence_attributes=('pos_tags', 'sem_orient', 'significance', 'md5sum', 'pos', 'token', 'dependency'), xml_version=2013)[source]¶

Parameters:
header_fields – the header_fields to include
sentence_attributes – sentence attributes to include
xml_version – version of the webLyzard XML format to use (XML2005.VERSION, XML2013.VERSION)
Returns:	the XML representation of the webLyzard XML object

classmethod get_xml_version(xml_content)[source]¶

lang¶

nilsimsa¶

classmethod parse_xml_content(xml_content, remove_duplicates=True)[source]¶

plain_text¶

Returns:	the plain text of the XML content

sentences¶

title¶

update_attributes(new_attributes)[source]¶: updates the existing attributes with new ones

update_sentences(sentences)[source]¶

Updates the values of the existing sentences. If the list of sentence objects is empty, sentence_objects is set to the new sentences.

Parameters:	sentences – list of Sentence objects

Warning

This function will not add new sentences to a non-empty sentence list.
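
The behaviour described above can be sketched as follows. Matching sentences by `md5sum` is an assumption (the reference does not say how existing sentences are identified), and `SentenceStub` and this standalone `update_sentences` are hypothetical stand-ins that model only the `pos` attribute.

```python
class SentenceStub(object):
    """Hypothetical stand-in carrying just an identifier and POS tags."""

    def __init__(self, md5sum, pos=None):
        self.md5sum = md5sum
        self.pos = pos


def update_sentences(existing, sentences):
    """Hypothetical sketch; returns the resulting sentence list."""
    if not existing:
        # Empty list: adopt the new sentences wholesale.
        return list(sentences)
    updates = {s.md5sum: s for s in sentences}
    for sentence in existing:
        new = updates.get(sentence.md5sum)  # assumed matching key
        if new is not None and new.pos is not None:
            sentence.pos = new.pos
    # Sentences without a match in `existing` are ignored, not appended.
    return existing


current = [SentenceStub("a1")]
incoming = [SentenceStub("a1", pos="DET NN"), SentenceStub("b2", pos="VB")]
result = update_sentences(current, incoming)
print([(s.md5sum, s.pos) for s in result])  # [('a1', 'DET NN')]
```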

`client` Module¶

client Package
