MLGroundTruthMixin
- class MLGroundTruthMixin
Bases:
object
Methods Summary
delete_groundtruth
(project_id, groundtruth_id)Delete Ground Truth
delete_groundtruth_label
(project_id, ...)Delete labeled extract
delete_groundtruth_rule
(project_id, ...)Delete rule
get_groundtruth
(project_id, groundtruth_id)Get a single Ground Truth.
get_groundtruth_item
(project_id, ...[, ...])Returns an item of the provided project enriched with Ground Truth data.
get_groundtruth_items
(project_id, groundtruth_id)Returns items for the provided project enriched with Ground Truth data.
get_groundtruth_label
(project_id, ...)Get a single labeled extract from a Ground Truth.
get_groundtruth_labels
(project_id, ...[, ...])Return the labeled extracts of a Ground Truth for a project in a list.
get_groundtruth_labels_batched
(project_id, ...)Returns a generator that goes through all the valid labeled extracts from the provided groundtruth.
get_groundtruth_rule
(project_id, ...)Get a single rule of the Ground Truth.
get_groundtruth_rules
(project_id, groundtruth_id)Get all rules for the Ground Truth.
get_groundtruths
(project_id)Return all Ground Truths for a project in a list.
modify_groundtruth
(project_id, groundtruth_id)Modify an existing Ground Truth.
modify_groundtruth_label
(project_id, ...[, ...])Modify an existing labeled extract.
modify_groundtruth_rule
(project_id, ...)Modify an existing rule.
new_groundtruth
(project_id, name, config)Create a new Ground Truth.
new_groundtruth_label
(project_id, ...)Create a new labeled extract.
new_groundtruth_labels
(project_id, ...)Create multiple labeled extracts.
new_groundtruth_rule
(project_id, ...)Create a new rule in Ground Truth.
Methods Documentation
- delete_groundtruth(project_id, groundtruth_id)
Delete Ground Truth
- Parameters:
project_id – Id of the Squirro project.
groundtruth_id – Id of the Ground Truth.
- delete_groundtruth_label(project_id, groundtruth_id, label_id)
Delete labeled extract
- Parameters:
project_id – Id of the Squirro project.
groundtruth_id – Id of the Ground Truth.
label_id – Id of the labeled extract
- delete_groundtruth_rule(project_id, groundtruth_id, rule_id)
Delete rule
- Parameters:
project_id – Id of the Squirro project.
groundtruth_id – Id of the Ground Truth.
rule_id – Id of the rule
- get_groundtruth(project_id, groundtruth_id)
Get a single Ground Truth.
- Parameters:
project_id – Id of the Squirro project.
groundtruth_id – Id of the GroundTruth
Examples
Get a single groundtruth:
>>> client.get_groundtruth(
...     project_id="DSuNrcnlSc6x5SJZh02IyQ",
...     groundtruth_id="n57sJlpcTLy4_XxdvHvx5g")
{'bulk_labeling_status': 'no bulk labelings',
 'config': {'candidatesets': [], 'labels': ['yes', 'no']},
 'created_at': '1996-03-13T00:00:00',
 'id': 'n57sJlpcTLy4_XxdvHvx5g',
 'modified_at': '2015-06-09T00:00:00',
 'name': 'Commercial final fly share white focus voice.',
 'project_id': 'DSuNrcnlSc6x5SJZh02IyQ',
 'rules': {}}
- get_groundtruth_item(project_id, groundtruth_id, item_id, highlight_query='', user_id=None, temporal_version='2024-11-21T06:12:59.427676', label=None, include_sentences=False)
Returns an item of the provided project enriched with Ground Truth data.
- Parameters:
project_id – Id of the Squirro project
groundtruth_id – Id of the GroundTruth
item_id – Id of the item
highlight_query – query containing highlight information
user_id – Id of the user to filter Ground Truth by
temporal_version – Temporal version of the Ground Truth
label – Label tag to filter Ground Truth by
include_sentences – Flag to return documents split in sentences
- Returns:
- get_groundtruth_items(project_id, groundtruth_id, user_id=None, temporal_version=None, label=None, labelled_filter=None, highlight_filter=False, **kwargs)
Returns items for the provided project enriched with Ground Truth data.
- Parameters:
project_id – Id of the Squirro project
groundtruth_id – Id of the GroundTruth
user_id – Id of the user to filter Ground Truth by
temporal_version – temporal version of the Ground Truth
label – label to filter Ground Truth by
labelled_filter – filter if all items, only the already labelled or only the unlabelled items should get returned (accepted values: 'all', 'labelled' and 'not_labelled')
highlight_filter – filter if only the highlighted items should get returned
kwargs – Additional query parameters. All keyword arguments are passed on verbatim to the API.
- Returns:
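As a minimal sketch of the labelled_filter values listed above, the wrapper below rejects anything other than 'all', 'labelled' or 'not_labelled' before forwarding the call. Here `client` is assumed to be an authenticated client object that exposes this mixin's methods. The wrapper name `fetch_groundtruth_items` and its default of 'all' are illustrative choices, not part of the library.

```python
# Values documented as accepted for the labelled_filter parameter.
ACCEPTED_LABELLED_FILTERS = ("all", "labelled", "not_labelled")


def fetch_groundtruth_items(client, project_id, groundtruth_id,
                            labelled_filter="all", **kwargs):
    """Hypothetical wrapper: validate labelled_filter, then forward the call.

    `client` is assumed to expose get_groundtruth_items as documented above.
    Extra keyword arguments are passed through unchanged.
    """
    if labelled_filter not in ACCEPTED_LABELLED_FILTERS:
        raise ValueError(
            f"labelled_filter must be one of {ACCEPTED_LABELLED_FILTERS}, "
            f"got {labelled_filter!r}"
        )
    return client.get_groundtruth_items(
        project_id,
        groundtruth_id,
        labelled_filter=labelled_filter,
        **kwargs,
    )
```

Validating on the caller's side surfaces a typo such as 'unlabelled' immediately instead of depending on how the server handles an unknown value.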
- get_groundtruth_label(project_id, groundtruth_id, label_id)
Get a single labeled extract from a Ground Truth.
- Parameters:
project_id – Id of the Squirro project.
groundtruth_id – Id of the GroundTruth
label_id – Id of the labeled extract
- get_groundtruth_labels(project_id, groundtruth_id, user_id=None, temporal_version=None, label=None, extract_query=None, item_ids=[], count=None, start=None)
Return the labeled extracts of a Ground Truth for a project in a list. Note: to avoid timeout issues in large Ground Truths, use get_groundtruth_labels_batched instead.
- Parameters:
project_id – Id of the Squirro project.
groundtruth_id – Id of the GroundTruth
user_id – Id of the user to filter Ground Truth by
temporal_version – temporal version of the Ground Truth
label – label to filter Ground Truth by
item_ids – item_ids to filter Ground Truth by
count – number of elements to retrieve of the Ground Truth
start – pagination offset for the retrieval of the Ground Truth
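The count and start parameters can be combined for manual pagination. The sketch below assumes that each call returns a plain list of labeled extracts, as the method description suggests; if the response wraps the list in a dictionary, the page would need to be unpacked first. The helper name `iter_labels_paged` and the `page_size` default are hypothetical.

```python
def iter_labels_paged(client, project_id, groundtruth_id, page_size=100, **filters):
    """Hypothetical helper: yield labeled extracts page by page.

    Assumes get_groundtruth_labels returns a list (an assumption, see above).
    `filters` may carry documented parameters such as user_id or label.
    """
    start = 0
    while True:
        page = client.get_groundtruth_labels(
            project_id,
            groundtruth_id,
            count=page_size,
            start=start,
            **filters,
        )
        if not page:
            return  # an empty page means there is nothing left
        yield from page
        if len(page) < page_size:
            return  # a short page is taken to be the last one
        start += len(page)
```

For large Ground Truths the batched generator documented below is likely the better fit, since the note above recommends it for avoiding timeouts.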
- get_groundtruth_labels_batched(project_id, groundtruth_id, batch_size=1000, temporal_version=None)
Returns a generator that goes through all the valid labeled extracts from the provided groundtruth. The generator is only valid for a short period of time, as it uses Elasticsearch’s PIT API under the hood. It is preferable to use this method instead of get_groundtruth_labels when the number of labels is large, as it avoids timeout issues.
- Parameters:
project_id (str) – The ID of the project.
groundtruth_id (str) – The ID of the groundtruth.
batch_size (int, optional) – The size of the batch requested from the API. Defaults to 1000. If timeout issues are encountered, try reducing this value.
temporal_version (str, optional) – The datetime string corresponding to the Ground Truth temporal version. Defaults to None.
- Yields:
dict – A dictionary containing information about a labeled extract, including the extract text, keywords, item ID, etc.
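Because the generator is only valid for a short period, it is probably safest to consume it right away rather than store it. The sketch below tallies the yielded extracts by label as they arrive. The dictionary key 'label' is an assumption (the exact keys of a labeled extract are not listed here), so it is exposed as the `label_key` parameter; `count_labels` itself is a hypothetical helper.

```python
from collections import Counter


def count_labels(client, project_id, groundtruth_id, batch_size=500, label_key="label"):
    """Hypothetical helper: count labeled extracts per label value.

    `label_key` is an assumed dictionary key; extracts without it are
    counted under None.
    """
    counts = Counter()
    # Iterate immediately: the generator is documented as short-lived.
    for extract in client.get_groundtruth_labels_batched(
        project_id, groundtruth_id, batch_size=batch_size
    ):
        counts[extract.get(label_key)] += 1
    return counts
```

Lowering `batch_size` below the default of 1000, as done here, follows the advice above for cases where timeouts occur.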
- get_groundtruth_rule(project_id, groundtruth_id, rule_id)
Get a single rule of the Ground Truth.
- Parameters:
project_id – Id of the Squirro project.
groundtruth_id – Id of the GroundTruth
rule_id – Id of the rule
- get_groundtruth_rules(project_id, groundtruth_id)
Get all rules for the Ground Truth.
- Parameters:
project_id – Id of the Squirro project
groundtruth_id – Id of the GroundTruth
- get_groundtruths(project_id)
Return all Ground Truths for a project in a list.
- Parameters:
project_id – Id of the Squirro project.
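A common use of this listing is to look up a Ground Truth by name when its id is not known. The sketch below assumes the method returns a list of dictionaries shaped like the get_groundtruth example output (with 'name' and 'id' keys); that shape is an assumption, and `find_groundtruth_by_name` is a hypothetical helper.

```python
def find_groundtruth_by_name(client, project_id, name):
    """Hypothetical helper: return the first Ground Truth with a matching name.

    Assumes get_groundtruths returns a list of dicts that carry a 'name' key.
    Returns None when no Ground Truth matches.
    """
    for groundtruth in client.get_groundtruths(project_id):
        if groundtruth.get("name") == name:
            return groundtruth
    return None
```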
- modify_groundtruth(project_id, groundtruth_id, name=None, config=None)
Modify an existing Ground Truth.
- Parameters:
project_id – Id of the Squirro project.
groundtruth_id – Id of the Ground Truth.
name – Name of the Ground Truth.
config – Dictionary of Ground Truth config.
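One way to change the config is a read-modify-write cycle: fetch the Ground Truth, copy its config, adjust it and send the whole dictionary back. The sketch below adds a label option to the 'labels' list shown in the get_groundtruth example. It assumes that the submitted config replaces the stored one rather than being merged into it, which is not confirmed here; `add_label_option` is a hypothetical helper.

```python
import copy


def add_label_option(client, project_id, groundtruth_id, new_label):
    """Hypothetical helper: append a label to the Ground Truth config.

    Assumption: modify_groundtruth replaces the stored config, so the
    full config is sent back rather than only the changed key.
    """
    current = client.get_groundtruth(project_id, groundtruth_id)
    # Work on a copy so the fetched dictionary is left untouched.
    config = copy.deepcopy(current["config"])
    labels = config.setdefault("labels", [])
    if new_label in labels:
        return config  # already present, no request needed
    labels.append(new_label)
    client.modify_groundtruth(project_id, groundtruth_id, config=config)
    return config
```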
- modify_groundtruth_label(project_id, groundtruth_id, label_id, validity, label=None)
Modify an existing labeled extract.
- Parameters:
project_id – Id of the Squirro project
groundtruth_id – Id of the Ground Truth
label_id – Id of the labeled extract
validity – validity of the labeled extract
label – label of the labeled extract
- modify_groundtruth_rule(project_id, groundtruth_id, rule_id, rule)
Modify an existing rule.
- Parameters:
project_id – Id of the Squirro project.
groundtruth_id – Id of the Ground Truth.
rule_id – Id of the rule
rule – information of the rule.
- new_groundtruth(project_id, name, config)
Create a new Ground Truth.
- Parameters:
project_id – Id of the Squirro project.
name – Name of the Ground Truth.
config – Ground Truth Config.
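The structure of config is not spelled out here. As a hedged sketch, the helper below builds a dictionary with the same two keys that appear in the get_groundtruth example output ('candidatesets' and 'labels'). Whether the server requires or accepts further keys is not known from this page. Both function names are hypothetical.

```python
def build_groundtruth_config(labels, candidatesets=None):
    """Hypothetical helper: build a config mirroring the example output.

    The key names are taken from the get_groundtruth example; treating
    them as sufficient for creation is an assumption.
    """
    labels = list(labels)
    if len(set(labels)) != len(labels):
        raise ValueError("labels must be unique")
    return {"candidatesets": list(candidatesets or []), "labels": labels}


def create_groundtruth(client, project_id, name, labels):
    """Create a Ground Truth with a config built from the given labels."""
    config = build_groundtruth_config(labels)
    return client.new_groundtruth(project_id, name, config)
```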
- new_groundtruth_label(project_id, groundtruth_id, label)
Create a new labeled extract.
- Parameters:
project_id – Id of the Squirro project.
groundtruth_id – Id of the Ground Truth
label – information of the labeled extract.
- new_groundtruth_labels(project_id, groundtruth_id, labels)
Create multiple labeled extracts.
- Parameters:
project_id – Id of the Squirro project.
groundtruth_id – Id of the Ground Truth.
labels – list of dicts, where each dict contains information of a labeled extract.
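When many labeled extracts need to be created, sending them in smaller chunks keeps each request modest in size. The sketch below treats every label dictionary as opaque, since the expected keys are not listed here. Chunking is a design choice of this example rather than a documented requirement, and `upload_labels_in_chunks` and `chunk_size` are hypothetical names.

```python
def upload_labels_in_chunks(client, project_id, groundtruth_id, labels, chunk_size=200):
    """Hypothetical helper: call new_groundtruth_labels once per chunk.

    `labels` is a list of dicts whose contents are passed through untouched.
    Returns the list of responses, one per request.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    responses = []
    for offset in range(0, len(labels), chunk_size):
        chunk = labels[offset:offset + chunk_size]
        responses.append(
            client.new_groundtruth_labels(project_id, groundtruth_id, chunk)
        )
    return responses
```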
- new_groundtruth_rule(project_id, groundtruth_id, rule)
Create a new rule in Ground Truth.
- Parameters:
project_id – Id of the Squirro project.
groundtruth_id – Id of the Ground Truth
rule – information of the rule.