weblyzard_api Package
The webLyzard API package.
Provides support for webLyzard web services. Please refer to client Module for a list of available web services.
xml_content Module
Created on Feb 27, 2013
Handles the new (http://www.weblyzard.com/wl/2013#) webLyzard XML format.
- Functions added:
  - support for sentence tokens and POS iterators
- Functions removed:
  - compatibility fixes for namespaces, encodings, etc.
  - support for the old POS tag mapping
class weblyzard_api.xml_content.LabeledDependency(parent, pos, label)
Bases: tuple
- label
  Alias for field number 2
- parent
  Alias for field number 0
- pos
  Alias for field number 1
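The "Alias for field number N" entries are what a named tuple exposes: each field can be read by name or by position. A minimal sketch of that behaviour, using a stand-in built with `collections.namedtuple` instead of importing `weblyzard_api` (the stand-in is an assumption that only mirrors the documented field order):

```python
from collections import namedtuple

# Stand-in with the documented field order (parent, pos, label);
# it is not imported from weblyzard_api.
LabeledDependency = namedtuple("LabeledDependency", "parent pos label")

dep = LabeledDependency(parent="1", pos="RB", label="SUB")

# Fields can be read by name or by position.
by_name = (dep.parent, dep.pos, dep.label)
by_index = (dep[0], dep[1], dep[2])

# Tuple unpacking follows the same field order.
parent, pos, label = dep
```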
class weblyzard_api.xml_content.Sentence(md5sum=None, pos=None, sem_orient=None, significance=None, token=None, value=None, is_title=False, dependency=None)
Bases: object
The sentence class used for accessing single sentences.
Note: the class provides convenient properties for accessing POS tags and tokens:
- s.sentence: the sentence text
- s.tokens: provides a list of tokens (e.g. ['A', 'new', 'day'])
- s.pos_tags: provides a list of POS tags (e.g. ['DET', 'CC', 'NN'])
- dependency_list
  Returns: the dependencies of the sentence as a list of LabeledDependency objects
  Return type: list of weblyzard_api.xml_content.LabeledDependency objects
  >>> s = Sentence(pos='RB PRP MD', dependency='1:SUB -1:ROOT 1:OBJ')
  >>> s.dependency_list
  [LabeledDependency(parent='1', pos='RB', label='SUB'), LabeledDependency(parent='-1', pos='PRP', label='ROOT'), LabeledDependency(parent='1', pos='MD', label='OBJ')]
- get_dependency_list()
  Returns: the dependencies of the sentence as a list of LabeledDependency objects
  Return type: list of weblyzard_api.xml_content.LabeledDependency objects
  >>> s = Sentence(pos='RB PRP MD', dependency='1:SUB -1:ROOT 1:OBJ')
  >>> s.dependency_list
  [LabeledDependency(parent='1', pos='RB', label='SUB'), LabeledDependency(parent='-1', pos='PRP', label='ROOT'), LabeledDependency(parent='1', pos='MD', label='OBJ')]
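The documented example pairs each POS tag with one `parent:label` entry of the dependency string. The following is a sketch of how such a pairing could be computed, not the library's actual implementation: `parse_dependencies` is a hypothetical helper, and `LabeledDependency` is a local stand-in for the class documented above.

```python
from collections import namedtuple

# Stand-in for weblyzard_api.xml_content.LabeledDependency (assumption).
LabeledDependency = namedtuple("LabeledDependency", "parent pos label")


def parse_dependencies(pos, dependency):
    """Pair each POS tag with its 'parent:label' entry (hypothetical helper)."""
    result = []
    for pos_tag, entry in zip(pos.split(), dependency.split()):
        # Split on the first colon only; parent stays a string, as in the
        # documented output.
        parent, label = entry.split(":", 1)
        result.append(LabeledDependency(parent, pos_tag, label))
    return result


deps = parse_dependencies("RB PRP MD", "1:SUB -1:ROOT 1:OBJ")
# Pick out the entry labelled ROOT.
root = [d for d in deps if d.label == "ROOT"][0]
```

With the inputs from the documented example, the sketch yields three entries in input order, and the `ROOT` entry carries the `PRP` tag.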
- get_pos_tags()
  Get the POS tags as a list.
  >>> sentence = Sentence(pos='PRP ADV NN')
  >>> sentence.get_pos_tags()
  ['PRP', 'ADV', 'NN']
- get_pos_tags_list()
  Returns: list of the sentence’s POS tags
  >>> sentence = Sentence(pos='PRP ADV NN')
  >>> sentence.get_pos_tags_list()
  ['PRP', 'ADV', 'NN']
- get_pos_tags_string()
  Returns: string of the sentence’s POS tags
  >>> sentence = Sentence(pos='PRP ADV NN')
  >>> sentence.get_pos_tags_string()
  'PRP ADV NN'
- pos_tag_string
  Returns: string of the sentence’s POS tags
  >>> sentence = Sentence(pos='PRP ADV NN')
  >>> sentence.get_pos_tags_string()
  'PRP ADV NN'
  Get the POS tags as a list.
  >>> sentence = Sentence(pos='PRP ADV NN')
  >>> sentence.get_pos_tags()
  ['PRP', 'ADV', 'NN']
  Returns: list of the sentence’s POS tags
  >>> sentence = Sentence(pos='PRP ADV NN')
  >>> sentence.get_pos_tags_list()
  ['PRP', 'ADV', 'NN']
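The POS accessors above expose the same space-separated tag string either as a list or as a string. A minimal sketch of that conversion, written as two hypothetical free functions rather than the `Sentence` methods themselves:

```python
def pos_tags_to_list(pos_string):
    """Split a space-separated POS string into a list (hypothetical helper)."""
    return pos_string.split()


def pos_tags_to_string(pos_tags):
    """Join a list of POS tags back into a space-separated string."""
    return " ".join(pos_tags)


tags = pos_tags_to_list("PRP ADV NN")
round_trip = pos_tags_to_string(tags)
```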
- sentence
- set_dependency_list(dependencies)
  Takes a list of weblyzard_api.xml_content.LabeledDependency objects.
  Parameters: dependencies (list) – the dependencies to set for this sentence.
  Note: the list must contain items of the type weblyzard_api.xml_content.LabeledDependency.
  >>> s = Sentence(pos='RB PRP MD', dependency='1:SUB -1:ROOT 1:OBJ')
  >>> s.dependency_list
  [LabeledDependency(parent='1', pos='RB', label='SUB'), LabeledDependency(parent='-1', pos='PRP', label='ROOT'), LabeledDependency(parent='1', pos='MD', label='OBJ')]
  >>> s.dependency_list = [LabeledDependency(parent='-1', pos='MD', label='ROOT'), ]
  >>> s.dependency_list
  [LabeledDependency(parent='-1', pos='MD', label='ROOT')]
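Setting the dependency list goes the other way: a list of `LabeledDependency` items has to be turned back into the `pos` and `dependency` strings that `Sentence` accepts. The sketch below assumes both strings are rebuilt from the list, which is consistent with the example above but not confirmed by it; `serialize_dependencies` is a hypothetical helper and `LabeledDependency` a local stand-in.

```python
from collections import namedtuple

# Stand-in for weblyzard_api.xml_content.LabeledDependency (assumption).
LabeledDependency = namedtuple("LabeledDependency", "parent pos label")


def serialize_dependencies(dependencies):
    """Return (pos, dependency) strings for a list of LabeledDependency items."""
    pos = " ".join(dep.pos for dep in dependencies)
    dependency = " ".join(f"{dep.parent}:{dep.label}" for dep in dependencies)
    return pos, dependency


pos, dependency = serialize_dependencies([
    LabeledDependency(parent="-1", pos="MD", label="ROOT"),
])
```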
- tokens
  Returns: an iterator providing the sentence’s tokens
class weblyzard_api.xml_content.XMLContent(xml_content, remove_duplicates=True)
Bases: object
- SUPPORTED_XML_VERSIONS = {'deprecated': <class 'weblyzard_api.xml_content.parsers.xml_deprecated.XMLDeprecated'>, 2005: <class 'weblyzard_api.xml_content.parsers.xml_2005.XML2005'>, 2013: <class 'weblyzard_api.xml_content.parsers.xml_2013.XML2013'>}
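`SUPPORTED_XML_VERSIONS` maps a version key to a parser class, so a parser can be chosen with a plain dictionary lookup. A sketch of that pattern with empty stand-in classes: the real parsers live in `weblyzard_api.xml_content.parsers`, and both `get_parser` and the `VERSION` attribute on `XMLDeprecated` are assumptions made for illustration.

```python
# Stand-ins for the real parser classes (assumption: empty placeholders).
class XMLDeprecated:
    VERSION = "deprecated"  # assumed attribute, not documented above


class XML2005:
    VERSION = 2005


class XML2013:
    VERSION = 2013


# Same key layout as the documented mapping.
SUPPORTED_XML_VERSIONS = {
    cls.VERSION: cls for cls in (XMLDeprecated, XML2005, XML2013)
}


def get_parser(xml_version):
    """Look up the parser class for a version key (hypothetical helper)."""
    try:
        return SUPPORTED_XML_VERSIONS[xml_version]
    except KeyError:
        raise ValueError(f"unsupported XML version: {xml_version!r}") from None


parser_class = get_parser(2013)
```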
- as_dict(mapping=None, ignore_non_sentence=False, add_titles_to_sentences=False)
  Convert the XML content to a dictionary.
  Parameters:
  - mapping – an optional mapping by which to restrict/rename the returned dictionary
  - ignore_non_sentence – if true, sentences without POS tags are omitted from the result
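To illustrate what "restrict/rename" can mean for a mapping, the sketch below applies a flat `{old_key: new_key}` mapping to a plain dictionary. The flat mapping format and the `apply_mapping` helper are assumptions; the mapping format that `as_dict` actually expects may differ.

```python
def apply_mapping(document, mapping=None):
    """Restrict and rename top-level keys of a dict (hypothetical helper)."""
    if mapping is None:
        # No mapping: return an unchanged copy.
        return dict(document)
    # Keep only keys named in the mapping and store them under the new name.
    return {
        new_key: document[old_key]
        for old_key, new_key in mapping.items()
        if old_key in document
    }


document = {"content_id": 12, "lang": "en", "title": "A new day"}
restricted = apply_mapping(document, {"content_id": "id", "title": "headline"})
```

Keys missing from the mapping (here `lang`) are dropped, and the remaining keys appear under their new names.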
- content_id
- content_type
- get_xml_document(header_fields='all', sentence_attributes=('pos_tags', 'sem_orient', 'significance', 'md5sum', 'pos', 'token', 'dependency'), xml_version=2013)
  Parameters:
  - header_fields – the header fields to include
  - sentence_attributes – the sentence attributes to include
  - xml_version – version of the webLyzard XML format to use (XML2005.VERSION, XML2013.VERSION)
  Returns: the XML representation of the webLyzard XML object
- lang
- nilsimsa
- plain_text
  Returns: the plain text of the XML content
- sentences
- title