meerqat.data.infoseek module#

class meerqat.data.infoseek.QuestionType(value)[source]#

Bases: Enum

An enumeration.

String = 0#

Numerical = 1#

Time = 2#

meerqat.data.infoseek.in_range(number: float, range_list: Tuple[float, float]) → bool[source]#: Check if a number is within the specified range (inclusive).
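The inclusive check can be illustrated with a minimal sketch. This is a re-implementation for illustration only, not the module's source; it assumes `range_list` is ordered as (lower bound, upper bound).

```python
from typing import Tuple


def in_range(number: float, range_list: Tuple[float, float]) -> bool:
    """Return True when number lies between the two bounds, endpoints included."""
    # Assumption: range_list is given as (lower bound, upper bound).
    low, high = range_list
    return low <= number <= high


print(in_range(5.0, (1.0, 5.0)))  # True: the upper bound is included
print(in_range(5.1, (1.0, 5.0)))  # False
```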

meerqat.data.infoseek.safe_division(x: float, y: float) → float[source]#: Divide x by y, returning 0 if y is 0.
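A minimal sketch of the documented behaviour (an illustrative re-implementation, not the module's source):

```python
def safe_division(x: float, y: float) -> float:
    """Divide x by y, falling back to 0 when the denominator is 0."""
    if y == 0:
        return 0.0
    return x / y


print(safe_division(6, 3))  # 2.0
print(safe_division(1, 0))  # 0.0 instead of raising ZeroDivisionError
```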

meerqat.data.infoseek.metric_numerical_range(pred: Union[float, Tuple[float, float], List[float]], answer: Union[float, Tuple[float, float], List[float]], tolerance: float = 0.1) → int[source]#

Scores numerical questions based on ranges and tolerances.

1) If the answer is a single number, convert it to a range with +/- tolerance.
2) If the prediction is a single number, return 1 if it is in the answer range, 0 otherwise.
3) If the prediction is a range, return 1 if the range is in the answer range or if the IOU (overlap between prediction and answer range) > 0.5, 0 otherwise.

Parameters:

pred – A list/tuple of 2 numbers or a single number.
answer – A list/tuple of 2 numbers or a single number.
tolerance – A float value for the tolerance range (default: 0.1).

Returns:

1 if conditions are met, 0 otherwise.

Return type:

int
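The scoring rules described above can be sketched as follows. This is an illustrative re-implementation, not the module's source. It assumes the tolerance is relative (the answer `a` becomes `[a - a*tolerance, a + a*tolerance]`), which the description does not state explicitly; the helpers `_as_list` and `_iou` are hypothetical names introduced here.

```python
from typing import List, Tuple, Union

Number = Union[int, float]
NumberOrRange = Union[Number, Tuple[Number, Number], List[Number]]


def _as_list(value: NumberOrRange) -> List[float]:
    """Hypothetical helper: wrap a single number, or copy a 2-element range."""
    if isinstance(value, (list, tuple)):
        return [float(v) for v in value]
    return [float(value)]


def _iou(a: List[float], b: List[float]) -> float:
    """Hypothetical helper: intersection over union of two 1-D ranges."""
    intersection = max(0.0, min(max(a), max(b)) - max(min(a), min(b)))
    union = (max(a) - min(a)) + (max(b) - min(b)) - intersection
    return intersection / union if union > 0 else 0.0


def metric_numerical_range(pred: NumberOrRange, answer: NumberOrRange,
                           tolerance: float = 0.1) -> int:
    pred, answer = _as_list(pred), _as_list(answer)
    # Rule 1: widen a single-number answer into a range.
    # ASSUMPTION: the tolerance is relative to the answer's magnitude.
    if len(answer) == 1:
        margin = abs(answer[0]) * tolerance
        answer = [answer[0] - margin, answer[0] + margin]
    low, high = min(answer), max(answer)
    # Rule 2: a single-number prediction must fall inside the answer range.
    if len(pred) == 1:
        return int(low <= pred[0] <= high)
    # Rule 3: a range prediction must be contained in the answer range,
    # or overlap it with IOU strictly greater than 0.5.
    if low <= min(pred) and max(pred) <= high:
        return 1
    return int(_iou(pred, answer) > 0.5)


print(metric_numerical_range(105, 100))              # 1: inside [90, 110]
print(metric_numerical_range(120, 100))              # 0: outside [90, 110]
print(metric_numerical_range([85, 110], [90, 110]))  # 1: IOU = 20 / 25 = 0.8
```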

meerqat.data.infoseek.find_numbers(string_number: str) → List[float][source]#

meerqat.data.infoseek.process_numerical_answer(string_number: str) → Union[float, List[float]][source]#

Parses numerical answer string into numbers (a single number or a range).

Clean the string and extract numbers:
if there are 2 numbers, return a range as [minimum value, maximum value];
else if there is 1 number, return a single number;
else return [0, 0].

Parameters: string_number – A string representing a numerical answer.
Returns: A single number or a list with 2 numbers.
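The branching on the number of extracted values can be sketched as below. This is an illustrative re-implementation, not the module's source. The cleaning step is an assumption: here it only strips thousands separators and extracts unsigned decimal numbers with a regular expression, whereas the real cleaning may do more.

```python
import re
from typing import List, Union


def find_numbers(string_number: str) -> List[float]:
    """Extract unsigned decimal numbers from a string."""
    # ASSUMPTION: commas are thousands separators and signs are ignored,
    # so "9-10" yields 9 and 10 rather than 9 and -10.
    cleaned = string_number.replace(",", "")
    return [float(match) for match in re.findall(r"\d+(?:\.\d+)?", cleaned)]


def process_numerical_answer(string_number: str) -> Union[float, List[float]]:
    numbers = find_numbers(string_number)
    if len(numbers) == 2:
        # Two numbers: a range, ordered as [minimum, maximum].
        return [min(numbers), max(numbers)]
    if len(numbers) == 1:
        # One number: return it as a scalar.
        return numbers[0]
    # Zero or more than two numbers: fall back to the empty range.
    return [0, 0]


print(process_numerical_answer("10-9 cm"))         # [9.0, 10.0]
print(process_numerical_answer("about 1,200 km"))  # 1200.0
print(process_numerical_answer("unknown"))         # [0, 0]
```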

meerqat.data.infoseek.find_all(s: str, c: str) → Generator[int, None, None][source]#

Find all occurrences of a character in a string and return their indices.

Parameters:

s – The input string to search.
c – The character to search for.

Yields:

int – The index of the next occurrence of the character.
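A generator with this contract can be sketched with `str.find` (an illustrative re-implementation, not the module's source):

```python
from typing import Generator


def find_all(s: str, c: str) -> Generator[int, None, None]:
    """Yield the index of every occurrence of c in s, left to right."""
    index = s.find(c)
    while index != -1:
        yield index
        # Resume the search just past the previous hit.
        index = s.find(c, index + 1)


print(list(find_all("1-2-3", "-")))  # [1, 3]
```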

meerqat.data.infoseek.clean_str_range(text: str) → str[source]#

Clean range expression in a string (e.g., ‘9-10’ –> ‘9 - 10’).

Parameters: text – The input string containing the range expression.
Returns: The cleaned string with proper spacing around the hyphen.
Return type: str
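The spacing normalization can be sketched with a regular expression. This is an illustrative re-implementation, not the module's source, and it assumes only hyphens that sit between two digits are rewritten; the real function may treat other hyphens differently.

```python
import re


def clean_str_range(text: str) -> str:
    """Put exactly one space on each side of a hyphen between two digits."""
    # ASSUMPTION: hyphens not surrounded by digits are left untouched.
    return re.sub(r"(?<=\d)\s*-\s*(?=\d)", " - ", text)


print(clean_str_range("9-10"))        # 9 - 10
print(clean_str_range("well-known"))  # well-known
```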

meerqat.data.infoseek.range_intersection_over_union(x_list: List[float], y_list: List[float]) → float[source]#: Calculate the intersection over union (IOU) of two ranges.
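For one-dimensional ranges, IOU is the length of the overlap divided by the total length covered by both ranges. A minimal sketch (an illustrative re-implementation, not the module's source; returning 0 for a zero-length union is an assumption):

```python
from typing import List


def range_intersection_over_union(x_list: List[float], y_list: List[float]) -> float:
    x_low, x_high = min(x_list), max(x_list)
    y_low, y_high = min(y_list), max(y_list)
    # The overlap length is clamped at 0 for disjoint ranges.
    intersection = max(0.0, min(x_high, y_high) - max(x_low, y_low))
    union = (x_high - x_low) + (y_high - y_low) - intersection
    # ASSUMPTION: two identical zero-length ranges score 0 rather than 1.
    return intersection / union if union > 0 else 0.0


print(range_intersection_over_union([0, 10], [0, 10]))  # 1.0
print(range_intersection_over_union([0, 1], [2, 3]))    # 0.0
```

For `[0, 10]` and `[5, 15]` the overlap is 5 and the union is 15, so the IOU is 1/3.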

meerqat.data.infoseek.evaluate_quantity(quantity_pred: List[Union[float, List[float]]], quantity_answer: List[List[float]]) → List[int][source]#: Evaluate numerical predictions against numerical answers.

meerqat.data.infoseek.evaluate_entity(entity_pred: List[str], entity_answer: List[List[str]]) → List[int][source]#

Evaluate entity predictions against entity answers.

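Criteria: for each prediction, the maximum exact-match score over its reference answers.

Parameters:

entity_pred – a list of predicted strings, one per question
entity_answer – for each question, a list of reference answer strings

Returns:

A list with one score (0 or 1) per prediction.

Return type:

List[int]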
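Taking the maximum exact-match score over several references amounts to checking whether any reference matches. A minimal sketch, not the module's source: it assumes strict string equality, whereas the real function may normalize strings (case, punctuation, articles) before comparing.

```python
from typing import List


def evaluate_entity(entity_pred: List[str], entity_answer: List[List[str]]) -> List[int]:
    scores = []
    for pred, references in zip(entity_pred, entity_answer):
        # ASSUMPTION: no normalization; a match requires identical strings.
        scores.append(int(any(pred == reference for reference in references)))
    return scores


print(evaluate_entity(["Paris", "Lyon"], [["Paris", "City of Paris"], ["Marseille"]]))
# [1, 0]
```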

meerqat.data.infoseek.evaluate_time(time_pred: List[str], time_answer: List[List[str]]) → List[int][source]#

Evaluate time predictions against time answers.

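Criteria:
1) a prediction within +/- one year of the reference is correct;
2) if the question asks for a date, a prediction with the correct year is correct.

Parameters:

time_pred – a list of predicted times, one per question
time_answer – for each question, a list of reference times

Returns:

A list with one score (0 or 1) per prediction.

Return type:

List[int]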
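The two criteria can be sketched as follows. This is a rough illustration, not the module's source, and it rests on assumptions the documentation does not confirm: times are strings that contain a four-digit year, a reference made only of digits is a year question, and any other reference (for example `1969-07-20`) is a date question. The helpers `_first_year` and `_is_correct` are hypothetical names.

```python
import re
from typing import List, Optional


def _first_year(text: str) -> Optional[int]:
    """Hypothetical helper: first run of four digits in the text, if any."""
    match = re.search(r"\d{4}", text)
    return int(match.group()) if match else None


def _is_correct(pred: str, reference: str) -> bool:
    pred_year, ref_year = _first_year(pred), _first_year(reference)
    if pred_year is None or ref_year is None:
        return False
    if reference.strip().isdigit():
        # Year question: accept +/- one year.
        return abs(pred_year - ref_year) <= 1
    # Date question (ASSUMED format): the year must be correct.
    return pred_year == ref_year


def evaluate_time(time_pred: List[str], time_answer: List[List[str]]) -> List[int]:
    return [
        int(any(_is_correct(pred, reference) for reference in references))
        for pred, references in zip(time_pred, time_answer)
    ]


print(evaluate_time(["1970", "1969", "1971"],
                    [["1969"], ["1969-07-20"], ["1969-07-20"]]))
# [1, 1, 0]
```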

meerqat.data.infoseek.evaluation(predictions: List[Dict[str, Any]], qid2example: Dict[str, Dict[str, Any]]) → Tuple[List[int], List[int], List[int]][source]#

Evaluate predictions against ground truth answers.

Separate questions into time, numerical, and string categories.

Parameters:

predictions – A list of predictions.
qid2example – A mapping from question ID to ground truth examples.

Returns:

Lists of scores for time, quantity, and entity predictions.

Return type:

Tuple[List[int], List[int], List[int]]

meerqat.data.infoseek.get_results(predictions: List[Dict[str, Any]], qid2example: Dict[str, Dict[str, Any]]) → Tuple[float, float, float, float][source]#

Get evaluation scores for predictions.

Parameters:

predictions – A list of predictions.
qid2example – A mapping from question ID to ground truth examples.

Returns:

Final scores for time, quantity, entity, and overall predictions.

Return type:

Tuple[float, float, float, float]

meerqat.data.infoseek.harmonic_mean(*args: float) → float[source]#: Calculate the harmonic mean of the input arguments.
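The harmonic mean of n values is n divided by the sum of their reciprocals. A minimal sketch (an illustrative re-implementation, not the module's source; returning 0 when there are no arguments or when any argument is 0 is an assumption made to avoid a division by zero):

```python
def harmonic_mean(*args: float) -> float:
    # ASSUMPTION: an empty call or a zero score yields 0 instead of an error.
    if not args or any(value == 0 for value in args):
        return 0.0
    return len(args) / sum(1.0 / value for value in args)


print(harmonic_mean(0.5, 0.5))  # 0.5
print(harmonic_mean(1.0, 0.0))  # 0.0
```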

meerqat.data.infoseek.evaluate_infoseek(predictions: List[Dict[str, Any]], qid2example: Dict[str, Dict[str, Any]]) → Dict[str, float][source]#

Evaluate predictions against references.

Parameters:

predictions – A list of predictions.
qid2example – A dictionary of references keyed by question_id.

Returns:

A dictionary containing the final scores for time, quantity, entity, and overall predictions.

Return type:

Dict[str, float]

meerqat.data.infoseek.evaluate_infoseek_full(predictions: Dict[str, List[Dict[str, Any]]], qid2example: Dict[str, Dict[str, Any]]) → Dict[str, Any][source]#

meerqat.data.infoseek.fix_space(string)[source]#

meerqat.data.infoseek.evaluate(prediction_path: Union[str, List[str]], reference_path: Union[str, Dataset], do_fix_space: bool = False) → Dict[str, Any][source]#

Evaluate predictions against references.

Parameters:

prediction_path – Path to a prediction file, or a list of such paths.
reference_path – Path to the reference file, or an already loaded Dataset.

Returns:

A dictionary containing the final scores for time, quantity, entity, and overall predictions.

Return type:

Dict[str, Any]

meerqat.data.infoseek.prepare_qid2example(reference: List[Dict[str, Any]]) → Dict[str, Dict[str, Any]][source]#: Convert reference to qid2example dictionary.

meerqat.data.infoseek.load_jsonl(path: str) → List[Dict[str, Any]][source]#: Load a JSONL file into a list of dictionaries.
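JSONL stores one JSON object per line, so loading reduces to parsing each line separately. A minimal sketch (an illustrative re-implementation, not the module's source; skipping blank lines and reading as UTF-8 are assumptions):

```python
import json
from typing import Any, Dict, List


def load_jsonl(path: str) -> List[Dict[str, Any]]:
    """Parse each non-empty line of the file as one JSON object."""
    with open(path, encoding="utf-8") as file:
        # ASSUMPTION: blank lines are ignored rather than treated as errors.
        return [json.loads(line) for line in file if line.strip()]
```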

meerqat.data.infoseek.main(prediction_path: str, reference_path: str, do_fix_space: bool = False)[source]#