meerqat.data.infoseek module#
- class meerqat.data.infoseek.QuestionType(value)[source]#
Bases: Enum
An enumeration.
- String = 0#
- Numerical = 1#
- Time = 2#
- meerqat.data.infoseek.in_range(number: float, range_list: Tuple[float, float]) bool [source]#
Check if a number is within the specified range (inclusive).
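As a minimal sketch of the inclusive check described above (an illustrative re-implementation written from the docstring, not the module's source), both bounds count as inside the range:

```python
from typing import Tuple


def in_range(number: float, range_list: Tuple[float, float]) -> bool:
    """Return True when range_list[0] <= number <= range_list[1]."""
    min_num, max_num = range_list
    return min_num <= number <= max_num


print(in_range(5.0, (5.0, 10.0)))   # True: the lower bound is included
print(in_range(10.5, (5.0, 10.0)))  # False
```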
- meerqat.data.infoseek.safe_division(x: float, y: float) float [source]#
Divide x by y, returning 0 if y is 0.
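The behaviour can be sketched in one line. This is an illustrative re-implementation based only on the description above, not the module's source:

```python
def safe_division(x: float, y: float) -> float:
    """Divide x by y, falling back to 0 instead of raising ZeroDivisionError."""
    return x / y if y != 0 else 0


print(safe_division(3, 4))  # 0.75
print(safe_division(3, 0))  # 0
```

Such a guard is convenient when averaging scores over a question category that may be empty.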
- meerqat.data.infoseek.metric_numerical_range(pred: Union[float, Tuple[float, float], List[float]], answer: Union[float, Tuple[float, float], List[float]], tolerance: float = 0.1) int [source]#
Scores numerical questions based on ranges and tolerances.
1) If the answer is a single number, convert it to a range with +/- tolerance.
2) If the prediction is a single number, return 1 if it is in the answer range, 0 otherwise.
3) If the prediction is a range, return 1 if the range is in the answer range or if the IOU (overlap between prediction and answer range) > 0.5, 0 otherwise.
- Parameters:
pred – A list/tuple of 2 numbers or a single number.
answer – A list/tuple of 2 numbers or a single number.
tolerance – A float value for the tolerance range (default: 0.1).
- Returns:
1 if conditions are met, 0 otherwise.
- Return type:
int
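The scoring steps described above can be sketched as follows. This is a hedged illustration, not the module's source: the names `numerical_range_score` and `_iou` are hypothetical, and the sketch assumes the tolerance is relative (the answer is widened to answer × (1 ± tolerance)), which the documentation does not state explicitly.

```python
from typing import List, Sequence, Tuple, Union

Quantity = Union[float, Tuple[float, float], List[float]]


def _iou(a: Sequence[float], b: Sequence[float]) -> float:
    """Intersection over union of two 1-D ranges (hypothetical helper)."""
    lo_a, hi_a = min(a), max(a)
    lo_b, hi_b = min(b), max(b)
    intersection = max(0.0, min(hi_a, hi_b) - max(lo_a, lo_b))
    union = (hi_a - lo_a) + (hi_b - lo_b) - intersection
    return intersection / union if union != 0 else 0.0


def numerical_range_score(pred: Quantity, answer: Quantity, tolerance: float = 0.1) -> int:
    # Step 1: widen a single-number answer into a range.
    # ASSUMPTION: the tolerance is relative to the answer value.
    if not isinstance(answer, (list, tuple)):
        answer = sorted((answer * (1 - tolerance), answer * (1 + tolerance)))
    lo, hi = answer

    # Step 2: a single-number prediction must fall inside the answer range.
    if not isinstance(pred, (list, tuple)):
        return 1 if lo <= pred <= hi else 0

    # Step 3: a range prediction must lie inside the answer range,
    # or overlap it with an IOU above 0.5.
    if lo <= pred[0] <= hi and lo <= pred[1] <= hi:
        return 1
    return 1 if _iou(pred, answer) > 0.5 else 0


print(numerical_range_score(105, 100))              # 1: 105 lies in [90, 110]
print(numerical_range_score(120, 100))              # 0
print(numerical_range_score([88, 110], [90, 110]))  # 1: IOU = 20 / 22
print(numerical_range_score([80, 100], [90, 110]))  # 0: IOU = 10 / 30
```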
- meerqat.data.infoseek.process_numerical_answer(string_number: str) Union[float, List[float]] [source]#
Parses numerical answer string into numbers (a single number or a range).
Clean the string and extract the numbers:
- if there are 2 numbers, return a range as [minimum value, maximum value];
- else if there is 1 number, return a single number;
- else return [0, 0].
- Parameters:
string_number – A string representing a numerical answer.
- Returns:
A single number or a list with 2 numbers.
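A rough sketch of this parsing logic is given below. It is not the module's implementation: the name `parse_numerical_answer` is hypothetical, and the sketch assumes a simple regular expression for unsigned decimal numbers and the removal of thousands separators as the only cleaning step.

```python
import re
from typing import List, Union

# ASSUMPTION: numbers are unsigned integers or decimals such as "12" or "3.5".
_NUMBER = re.compile(r"\d+(?:\.\d+)?")


def parse_numerical_answer(string_number: str) -> Union[float, List[float]]:
    cleaned = string_number.replace(",", "")  # "1,200" -> "1200"
    numbers = [float(match) for match in _NUMBER.findall(cleaned)]
    if len(numbers) == 2:
        return [min(numbers), max(numbers)]   # range as [minimum, maximum]
    if len(numbers) == 1:
        return numbers[0]                     # single number
    return [0, 0]                             # nothing usable was found


print(parse_numerical_answer("1,200 kg"))   # 1200.0
print(parse_numerical_answer("10 - 9 cm"))  # [9.0, 10.0]
print(parse_numerical_answer("unknown"))    # [0, 0]
```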
- meerqat.data.infoseek.find_all(s: str, c: str) Generator[int, None, None] [source]#
Find all occurrences of a character in a string and return their indices.
- Parameters:
s – The input string to search.
c – The character to search for.
- Yields:
int – The index of the next occurrence of the character.
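A generator with this contract can be sketched with `str.find`. This is an illustrative re-implementation, not the module's source:

```python
from typing import Generator


def find_all(s: str, c: str) -> Generator[int, None, None]:
    """Lazily yield the index of every occurrence of c in s."""
    idx = s.find(c)
    while idx != -1:
        yield idx
        idx = s.find(c, idx + 1)  # resume the search just after the last hit


print(list(find_all("1-2-3", "-")))  # [1, 3]
```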
- meerqat.data.infoseek.clean_str_range(text: str) str [source]#
Clean range expression in a string (e.g., ‘9-10’ –> ‘9 - 10’).
- Parameters:
text – The input string containing the range expression.
- Returns:
The cleaned string with proper spacing around the hyphen.
- Return type:
str
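The effect can be approximated with a regular expression. The function name `space_out_ranges` is hypothetical, and the sketch assumes that only hyphens sitting between two digits should be padded; the module may handle other cases differently.

```python
import re


def space_out_ranges(text: str) -> str:
    """Put exactly one space on each side of a hyphen that joins two digits."""
    # ASSUMPTION: hyphens not surrounded by digits (e.g. "well-known") are left alone.
    return re.sub(r"(?<=\d)\s*-\s*(?=\d)", " - ", text)


print(space_out_ranges("9-10"))          # 9 - 10
print(space_out_ranges("9 - 10"))        # 9 - 10 (already clean)
print(space_out_ranges("well-known 3"))  # well-known 3
```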
- meerqat.data.infoseek.range_intersection_over_union(x_list: List[float], y_list: List[float]) float [source]#
Calculate the intersection over union (IOU) of two ranges.
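For reference, the usual definition of IOU for two one-dimensional ranges is written out below, with a worked example. The symbols are introduced here for illustration; how the function treats degenerate (zero-width) ranges is not documented and is not covered by this formula.

```latex
% x = [x_min, x_max] and y = [y_min, y_max] are the two ranges.
\[
I = \max\bigl(0,\ \min(x_{\max}, y_{\max}) - \max(x_{\min}, y_{\min})\bigr),
\qquad
\mathrm{IOU}(x, y) = \frac{I}{(x_{\max} - x_{\min}) + (y_{\max} - y_{\min}) - I}
\]
% Worked example: x = [0, 10], y = [5, 15]
\[
I = 10 - 5 = 5,
\qquad
\mathrm{IOU} = \frac{5}{10 + 10 - 5} = \frac{1}{3}
\]
```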
- meerqat.data.infoseek.evaluate_quantity(quantity_pred: List[Union[float, List[float]]], quantity_answer: List[List[float]]) List[int] [source]#
Evaluate numerical predictions against numerical answers.
- meerqat.data.infoseek.evaluate_entity(entity_pred: List[str], entity_answer: List[List[str]]) List[int] [source]#
Evaluate entity predictions against entity answers.
Criteria: Maximum score of exact match to entity answer.
- Parameters:
entity_pred – A list of predicted strings, one per question.
entity_answer – For each question, a list of reference answer strings.
- Returns:
A list with one score (0 or 1) per prediction.
- Return type:
List[int]
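The "maximum score of exact match" criterion can be sketched as below. The name `evaluate_entity_sketch` is hypothetical, and the sketch assumes plain string equality; any normalization the module applies before comparing (case, punctuation, articles) is not shown.

```python
from typing import List


def evaluate_entity_sketch(entity_pred: List[str], entity_answer: List[List[str]]) -> List[int]:
    """Score each prediction 1 if it equals at least one of its references."""
    return [
        # Maximum over the per-reference exact-match scores; 0 if no references.
        max((int(pred == ref) for ref in refs), default=0)
        for pred, refs in zip(entity_pred, entity_answer)
    ]


preds = ["Paris", "Rome"]
refs = [["Paris", "City of Paris"], ["Milan"]]
print(evaluate_entity_sketch(preds, refs))  # [1, 0]
```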
- meerqat.data.infoseek.evaluate_time(time_pred: List[str], time_answer: List[List[str]]) List[int] [source]#
Evaluate time predictions against time answers.
Criteria:
1) a prediction within +/- one year of the reference is correct;
2) if the question asks for a date, a prediction with the correct year is correct.
- Parameters:
time_pred – A list of predicted times, one per question.
time_answer – For each question, a list of reference times.
- Returns:
A list with one score (0 or 1) per prediction.
- Return type:
List[int]
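One way to picture the two criteria is to compare years only, which satisfies both at once. The sketch below is an assumption-laden illustration, not the module's source: `evaluate_time_sketch` and `_first_year` are hypothetical names, and years are assumed to appear as standalone four-digit numbers.

```python
import re
from typing import List, Optional

# ASSUMPTION: a year is the first standalone four-digit number in the string.
_YEAR = re.compile(r"\b(\d{4})\b")


def _first_year(text: str) -> Optional[int]:
    match = _YEAR.search(text)
    return int(match.group(1)) if match else None


def evaluate_time_sketch(time_pred: List[str], time_answer: List[List[str]]) -> List[int]:
    scores = []
    for pred, refs in zip(time_pred, time_answer):
        pred_year = _first_year(pred)
        ref_years = [y for y in map(_first_year, refs) if y is not None]
        # Correct when the predicted year is within one year of any reference year;
        # full dates are reduced to their year before comparing.
        hit = pred_year is not None and any(abs(pred_year - y) <= 1 for y in ref_years)
        scores.append(int(hit))
    return scores


preds = ["1990", "12 March 2001", "1850"]
refs = [["1991"], ["2001-03-15"], ["1900"]]
print(evaluate_time_sketch(preds, refs))  # [1, 1, 0]
```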
- meerqat.data.infoseek.evaluation(predictions: List[Dict[str, Any]], qid2example: Dict[str, Dict[str, Any]]) Tuple[List[int], List[int], List[int]] [source]#
Evaluate predictions against ground truth answers.
Separate questions into time, numerical, and string categories.
- Parameters:
predictions – A list of predictions.
qid2example – A mapping from question ID to ground truth examples.
- Returns:
Lists of scores for time, quantity, and entity predictions.
- Return type:
Tuple[List[int], List[int], List[int]]
- meerqat.data.infoseek.get_results(predictions: List[Dict[str, Any]], qid2example: Dict[str, Dict[str, Any]]) Tuple[float, float, float, float] [source]#
Get evaluation scores for predictions.
- Parameters:
predictions – A list of predictions.
qid2example – A mapping from question ID to ground truth examples.
- Returns:
Final scores for time, quantity, entity, and overall predictions.
- Return type:
Tuple[float, float, float, float]
- meerqat.data.infoseek.harmonic_mean(*args: float) float [source]#
Calculate the harmonic mean of the input arguments.
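The harmonic mean of n values is n divided by the sum of their reciprocals. A minimal sketch follows; it is an illustrative re-implementation, and returning 0 when any argument is 0 is an assumption made here to avoid a division by zero, not documented behaviour.

```python
def harmonic_mean(*args: float) -> float:
    """Harmonic mean: len(args) / sum(1 / x for x in args)."""
    # ASSUMPTION: an empty call or any zero argument yields 0.
    if not args or any(x == 0 for x in args):
        return 0.0
    return len(args) / sum(1.0 / x for x in args)


print(harmonic_mean(50.0, 50.0))  # 50.0
print(harmonic_mean(40.0, 60.0))  # close to 48, below the arithmetic mean of 50
```

Because the harmonic mean is pulled toward the smallest argument, a low score on one split cannot be hidden by a high score on another.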
- meerqat.data.infoseek.evaluate_infoseek(predictions: List[Dict[str, Any]], qid2example: Dict[str, Dict[str, Any]]) Dict[str, float] [source]#
Evaluate predictions against references.
- Parameters:
predictions – A list of predictions.
qid2example – A dictionary of references keyed by question_id.
- Returns:
A dictionary containing the final scores for time, quantity, entity, and overall predictions.
- Return type:
Dict[str, float]
- meerqat.data.infoseek.evaluate_infoseek_full(predictions: Dict[str, List[Dict[str, Any]]], qid2example: Dict[str, Dict[str, Any]]) Dict[str, Any] [source]#
- meerqat.data.infoseek.evaluate(prediction_path: Union[str, List[str]], reference_path: Union[str, Dataset], do_fix_space: bool = False) Dict[str, Any] [source]#
Evaluate predictions against references.
- Parameters:
prediction_path – Path to the prediction file, or a list of such paths.
reference_path – Path to the reference file, or an already loaded Dataset.
- Returns:
A dictionary containing the final scores for time, quantity, entity, and overall predictions.
- Return type:
Dict[str, Any]
- meerqat.data.infoseek.prepare_qid2example(reference: List[Dict[str, Any]]) Dict[str, Dict[str, Any]] [source]#
Convert reference to qid2example dictionary.