meerqat.data.infoseek module

class meerqat.data.infoseek.QuestionType(value)

Bases: Enum

An enumeration of the question types that the evaluation distinguishes: string, numerical, and time.

String = 0
Numerical = 1
Time = 2

meerqat.data.infoseek.in_range(number: float, range_list: Tuple[float, float]) -> bool

Check if a number is within the specified range (inclusive).

meerqat.data.infoseek.safe_division(x: float, y: float) -> float

Divide x by y, returning 0 if y is 0.

meerqat.data.infoseek.metric_numerical_range(pred: Union[float, Tuple[float, float], List[float]], answer: Union[float, Tuple[float, float], List[float]], tolerance: float = 0.1) -> int

Scores numerical questions based on ranges and tolerances.

  1. First, convert a single-number answer to a range with +/- tolerance.

  2. If the prediction is a single number, return 1 if it is in the answer range, 0 otherwise.

  3. If the prediction is a range, return 1 if the range lies within the answer range or if the IOU (overlap between the prediction and answer ranges) is greater than 0.5, 0 otherwise.

Parameters:
  • pred – A list/tuple of 2 numbers or a single number.

  • answer – A list/tuple of 2 numbers or a single number.

  • tolerance – A float value for the tolerance range (default: 0.1).

Returns:

1 if conditions are met, 0 otherwise.

Return type:

int
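The three scoring steps can be sketched as follows. This is a minimal, self-contained illustration and not the module's implementation: the name `score_numerical_sketch` is hypothetical, and it assumes the tolerance is relative (the answer is widened to `answer * (1 - tolerance)` through `answer * (1 + tolerance)`), which the description above does not confirm.

```python
from typing import List, Tuple, Union

Number = Union[int, float]
NumberOrRange = Union[Number, Tuple[float, float], List[float]]


def score_numerical_sketch(pred: NumberOrRange, answer: NumberOrRange,
                           tolerance: float = 0.1) -> int:
    """Hypothetical re-statement of the documented scoring steps."""
    # Step 1: widen a single-number answer into a range.
    # Assumption: the tolerance is relative to the answer value.
    if isinstance(answer, (int, float)):
        low, high = sorted((answer * (1 - tolerance), answer * (1 + tolerance)))
    else:
        low, high = min(answer), max(answer)

    # Step 2: a single-number prediction only has to fall inside the range.
    if isinstance(pred, (int, float)):
        return int(low <= pred <= high)

    # Step 3: a range prediction is accepted if it is contained in the
    # answer range, or if the intersection over union exceeds 0.5.
    p_low, p_high = min(pred), max(pred)
    if low <= p_low and p_high <= high:
        return 1
    overlap = max(0.0, min(p_high, high) - max(p_low, low))
    union = (p_high - p_low) + (high - low) - overlap
    iou = overlap / union if union > 0 else 0.0
    return int(iou > 0.5)
```

For instance, under the relative-tolerance assumption an answer of 95 becomes roughly the range [85.5, 104.5], so a prediction of 100 scores 1 and a prediction of 110 scores 0.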

meerqat.data.infoseek.find_numbers(string_number: str) -> List[float]

meerqat.data.infoseek.process_numerical_answer(string_number: str) -> Union[float, List[float]]

Parses numerical answer string into numbers (a single number or a range).

  1. Clean the string and extract the numbers.

  2. If there are 2 numbers, return a range as [minimum value, maximum value]; if there is 1 number, return that single number; otherwise return [0, 0].

Parameters:

string_number – A string representing a numerical answer.

Returns:

A single number or a list of 2 numbers.
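The parsing rule can be illustrated with a small sketch. The function name `parse_numerical_answer_sketch` is hypothetical, and the cleaning step (dropping thousands-separator commas) and the regular expression are assumptions made for illustration; the module's own cleaning may differ.

```python
import re
from typing import List, Union


def parse_numerical_answer_sketch(text: str) -> Union[float, List[float]]:
    """Hypothetical parser following the documented return rules."""
    # Assumed cleaning step: drop thousands separators such as "1,200".
    cleaned = text.replace(",", "")
    # A "-" directly after a digit is read as a range separator, not a sign.
    found = re.findall(r"(?<!\d)-?\d+(?:\.\d+)?", cleaned)
    numbers = [float(token) for token in found]

    if len(numbers) == 2:
        return [min(numbers), max(numbers)]  # range as [minimum, maximum]
    if len(numbers) == 1:
        return numbers[0]                    # single number
    return [0.0, 0.0]                        # nothing usable was found


print(parse_numerical_answer_sketch("9-10 m"))    # [9.0, 10.0]
print(parse_numerical_answer_sketch("1,200 kg"))  # 1200.0
print(parse_numerical_answer_sketch("unknown"))   # [0.0, 0.0]
```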

meerqat.data.infoseek.find_all(s: str, c: str) -> Generator[int, None, None]

Find all occurrences of a character in a string and return their indices.

Parameters:
  • s – The input string to search.

  • c – The character to search for.

Yields:

int – The index of the next occurrence of the character.
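A generator with this behaviour can be written with repeated `str.find` calls. The sketch below uses the hypothetical name `find_all_sketch` and only mirrors the documented contract; it is not taken from the module source.

```python
from typing import Generator


def find_all_sketch(s: str, c: str) -> Generator[int, None, None]:
    """Yield the index of every occurrence of c in s, left to right."""
    idx = s.find(c)
    while idx != -1:
        yield idx
        # Resume the search just past the previous hit.
        idx = s.find(c, idx + 1)


print(list(find_all_sketch("2009-2010-11", "-")))  # [4, 9]
```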

meerqat.data.infoseek.clean_str_range(text: str) -> str

Clean range expression in a string (e.g., ‘9-10’ –> ‘9 - 10’).

Parameters:

text – The input string containing the range expression.

Returns:

The cleaned string with proper spacing around the hyphen.

Return type:

str
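One way to obtain this spacing is a regular-expression substitution. The sketch below is an alternative formulation under the assumption that only hyphens sitting between two digits count as range separators; `space_out_ranges_sketch` is a hypothetical name and the module may handle other cases differently.

```python
import re


def space_out_ranges_sketch(text: str) -> str:
    """Put single spaces around a hyphen that separates two digits."""
    # Lookbehind and lookahead keep the surrounding digits untouched.
    return re.sub(r"(?<=\d)\s*-\s*(?=\d)", " - ", text)


print(space_out_ranges_sketch("about 9-10 cm"))  # about 9 - 10 cm
print(space_out_ranges_sketch("well-known"))     # well-known
```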

meerqat.data.infoseek.range_intersection_over_union(x_list: List[float], y_list: List[float]) -> float

Calculate the intersection over union (IOU) of two ranges.
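For one-dimensional ranges, the IOU is the length of the overlap divided by the combined length covered by both ranges. A minimal sketch, with the hypothetical name `range_iou_sketch` and the assumption that two zero-length ranges score 0:

```python
from typing import List


def range_iou_sketch(x_range: List[float], y_range: List[float]) -> float:
    """Intersection over union of two 1-D ranges given as [a, b]."""
    x_low, x_high = min(x_range), max(x_range)
    y_low, y_high = min(y_range), max(y_range)
    # Overlap is zero when the ranges are disjoint.
    overlap = max(0.0, min(x_high, y_high) - max(x_low, y_low))
    union = (x_high - x_low) + (y_high - y_low) - overlap
    # Assumption: degenerate (zero-length) input yields 0 instead of an error.
    return overlap / union if union != 0 else 0.0


print(range_iou_sketch([1, 3], [1, 3]))  # 1.0
print(range_iou_sketch([0, 1], [2, 3]))  # 0.0
```

With `[0, 10]` and `[5, 15]` the overlap is 5 and the union is 15, giving an IOU of one third.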

meerqat.data.infoseek.evaluate_quantity(quantity_pred: List[Union[float, List[float]]], quantity_answer: List[List[float]]) -> List[int]

Evaluate numerical predictions against numerical answers.

meerqat.data.infoseek.evaluate_entity(entity_pred: List[str], entity_answer: List[List[str]]) -> List[int]

Evaluate entity predictions against entity answers.

Criteria: the maximum exact-match score over the reference answers, so a prediction scores 1 if it exactly matches any of its references.

Parameters:
  • entity_pred – the predicted strings, one per question

  • entity_answer – for each question, a list of reference answer strings

Returns:

One score per question, each 0 or 1.

Return type:

List[int]
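The "maximum over references" criterion can be sketched as below. The names `evaluate_entity_sketch` and `_normalize` are hypothetical, and the light normalization (lowercasing and collapsing whitespace) is an assumption; the module may normalize answers differently before comparing them.

```python
from typing import List


def _normalize(text: str) -> str:
    # Assumption: lowercase and collapse whitespace before comparing.
    return " ".join(text.lower().split())


def evaluate_entity_sketch(preds: List[str],
                           answers: List[List[str]]) -> List[int]:
    """Score each prediction 1 if it matches any of its references."""
    scores = []
    for pred, references in zip(preds, answers):
        matches = [int(_normalize(pred) == _normalize(ref)) for ref in references]
        # Maximum exact-match score; 0 when there are no references.
        scores.append(max(matches, default=0))
    return scores


print(evaluate_entity_sketch(["Paris", "Lyon"],
                             [["paris", "Paris, France"], ["Marseille"]]))  # [1, 0]
```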

meerqat.data.infoseek.evaluate_time(time_pred: List[str], time_answer: List[List[str]]) -> List[int]

Evaluate time predictions against time answers.

Criteria: 1) a prediction within +/- one year of the reference is correct; 2) if the question asks for a date, a prediction with the correct year is correct.

Parameters:
  • time_pred – the predicted time strings, one per question

  • time_answer – for each question, a list of reference time strings

Returns:

One score per question, each 0 or 1.

Return type:

List[int]

meerqat.data.infoseek.evaluation(predictions: List[Dict[str, Any]], qid2example: Dict[str, Dict[str, Any]]) -> Tuple[List[int], List[int], List[int]]

Evaluate predictions against ground truth answers.

Separate questions into time, numerical, and string categories.

Parameters:
  • predictions – A list of predictions.

  • qid2example – A mapping from question ID to ground truth examples.

Returns:

Lists of scores for time, quantity, and entity predictions.

Return type:

Tuple[List[int], List[int], List[int]]

meerqat.data.infoseek.get_results(predictions: List[Dict[str, Any]], qid2example: Dict[str, Dict[str, Any]]) -> Tuple[float, float, float, float]

Get evaluation scores for predictions.

Parameters:
  • predictions – A list of predictions.

  • qid2example – A mapping from question ID to ground truth examples.

Returns:

Final scores for time, quantity, entity, and overall predictions.

Return type:

Tuple[float, float, float, float]

meerqat.data.infoseek.harmonic_mean(*args: float) -> float

Calculate the harmonic mean of the input arguments.
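The harmonic mean of n values is n divided by the sum of their reciprocals. A minimal sketch follows; the name `harmonic_mean_sketch` is hypothetical, and returning 0 for empty input or when any value is 0 is an assumption made to avoid a division by zero.

```python
def harmonic_mean_sketch(*args: float) -> float:
    """Harmonic mean: n / (1/a_1 + ... + 1/a_n)."""
    # Assumption: no arguments, or any zero argument, gives 0.
    if not args or any(a == 0 for a in args):
        return 0.0
    return len(args) / sum(1.0 / a for a in args)


print(harmonic_mean_sketch(1, 1))  # 1.0
```

For two scores of 40 and 60 the result is 48, below the arithmetic mean of 50, because the harmonic mean weights the lower score more heavily.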

meerqat.data.infoseek.evaluate_infoseek(predictions: List[Dict[str, Any]], qid2example: Dict[str, Dict[str, Any]]) -> Dict[str, float]

Evaluate predictions against references.

Parameters:
  • predictions – A list of predictions.

  • qid2example – A dictionary of references keyed by question_id.

Returns:

A dictionary containing the final scores for time, quantity, entity, and overall predictions.

Return type:

Dict[str, float]

meerqat.data.infoseek.evaluate_infoseek_full(predictions: Dict[str, List[Dict[str, Any]]], qid2example: Dict[str, Dict[str, Any]]) -> Dict[str, Any]

meerqat.data.infoseek.fix_space(string)

meerqat.data.infoseek.evaluate(prediction_path: Union[str, List[str]], reference_path: Union[str, Dataset], do_fix_space: bool = False) -> Dict[str, Any]

Evaluate predictions against references.

Parameters:
  • prediction_path – Path to the prediction file, or a list of such paths.

  • reference_path – Path to the reference file, or an already loaded Dataset.

Returns:

A dictionary containing the final scores for time, quantity, entity, and overall predictions.

Return type:

Dict[str, Any]

meerqat.data.infoseek.prepare_qid2example(reference: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]

Convert reference to qid2example dictionary.

meerqat.data.infoseek.load_jsonl(path: str) -> List[Dict[str, Any]]

Load a JSONL file into a list of dictionaries.
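A JSONL file stores one JSON object per line, so loading it amounts to parsing each non-empty line. The sketch below uses the hypothetical name `load_jsonl_sketch` and assumes UTF-8 encoding and that blank lines should be skipped; neither detail is confirmed by the description above.

```python
import json
from typing import Any, Dict, List


def load_jsonl_sketch(path: str) -> List[Dict[str, Any]]:
    """Parse one JSON object per non-empty line of the file."""
    with open(path, encoding="utf-8") as handle:
        # Assumption: blank lines are ignored rather than treated as errors.
        return [json.loads(line) for line in handle if line.strip()]
```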

meerqat.data.infoseek.main(prediction_path: str, reference_path: str, do_fix_space: bool = False)