Relation Extraction

The RelationExtraction task performs joint entity and relation extraction, identifying relationships between entities in text.

Usage

relations = {
    "works_for": "A person works for a company or organization.",
    "located_in": "A place or organization is located in a city, country, or region.",
    "founded": "A person founded a company or organization.",
}

fewshot_examples = [
    relation_extraction.FewshotExample(
        text="Henri Dunant founded the Red Cross in Geneva.",
        triplets=[
            RelationTriplet(
                head=RelationEntity(text="Henri Dunant", entity_type="PERSON"),
                relation="founded",
                tail=RelationEntity(text="Red Cross", entity_type="ORGANIZATION"),
                score=1.0,
            ),
            RelationTriplet(
                head=RelationEntity(text="Red Cross", entity_type="ORGANIZATION"),
                relation="located_in",
                tail=RelationEntity(text="Geneva", entity_type="LOCATION"),
                score=1.0,
            ),
        ],
    ),
    relation_extraction.FewshotExample(
        text="Eglantyne Jebb founded Save the Children in London.",
        triplets=[
            RelationTriplet(
                head=RelationEntity(text="Eglantyne Jebb", entity_type="PERSON"),
                relation="founded",
                tail=RelationEntity(text="Save the Children", entity_type="ORGANIZATION"),
                score=1.0,
            ),
            RelationTriplet(
                head=RelationEntity(text="Save the Children", entity_type="ORGANIZATION"),
                relation="located_in",
                tail=RelationEntity(text="London", entity_type="LOCATION"),
                score=1.0,
            ),
        ],
    ),
]

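# `fewshot` is a boolean flag set by the caller (not shown here); it toggles the use of few-shot examples.
fewshot_args = {"fewshot_examples": fewshot_examples} if fewshot else {}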

task = relation_extraction.RelationExtraction(
    relations=relations,
    model=batch_runtime.model,
    model_settings=batch_runtime.model_settings,
    batch_size=batch_runtime.batch_size,
    entity_types=["PERSON", "ORGANIZATION", "LOCATION"],
    **fewshot_args
)

pipe = Pipeline(task)
docs = list(pipe(relation_extraction_docs))

Results

The RelationExtraction task returns a unified Result object containing a list of RelationTriplet objects.

Each triplet includes a confidence score:
- GLiNER2: Always present and derived from logits.
- LLMs: Self-reported and may be None if not provided by the model.

class Result(pydantic.BaseModel):
    """Result of a relation extraction task.

    Attributes:
        triplets: List of extracted relation triplets.
    """

    triplets: list[RelationTriplet]

Each RelationTriplet consists of:
- head: A RelationEntity representing the subject.
- relation: The string identifier of the relationship.
- tail: A RelationEntity representing the object.

A RelationEntity includes the surface text, entity_type, and character start/end offsets.
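
To illustrate how the optional score can be handled downstream, here is a minimal sketch. The `Entity` and `Triplet` dataclasses are stand-ins that only mirror the fields described above; they are not the library's classes, and `confident_triplets` and its threshold of 0.5 are hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Optional


# Stand-in types mirroring the fields described above (NOT the library's classes).
@dataclass
class Entity:
    text: str
    entity_type: str


@dataclass
class Triplet:
    head: Entity
    relation: str
    tail: Entity
    score: Optional[float] = None


def confident_triplets(triplets: list, threshold: float = 0.5) -> list:
    """Keep triplets scoring at least `threshold`; unscored triplets are kept as well."""
    return [t for t in triplets if t.score is None or t.score >= threshold]


triplets = [
    Triplet(Entity("Henri Dunant", "PERSON"), "founded", Entity("Red Cross", "ORGANIZATION"), 0.92),
    # An LLM backend may omit the score, in which case it is None.
    Triplet(Entity("Red Cross", "ORGANIZATION"), "located_in", Entity("Geneva", "LOCATION"), None),
    Triplet(Entity("Geneva", "LOCATION"), "founded", Entity("Red Cross", "ORGANIZATION"), 0.10),
]

kept = confident_triplets(triplets)
for t in kept:
    print(f"{t.head.text} -[{t.relation}]-> {t.tail.text} (score={t.score})")
```

Whether unscored triplets should be kept or dropped depends on the application; the sketch keeps them so that results without self-reported scores are not silently discarded.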

Evaluation

Performance of the relation extraction task can be measured using the .evaluate() method.

Metric: Corpus-wide Micro-F1 Score (F1). Triplets are matched based on the head entity text, the relation type, and the tail entity text.
Requirement: Each document must have ground-truth triplets stored in doc.gold[task_id].

report = task.evaluate(docs)
print(f"Relation F1-Score: {report.metrics['F1']}")
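
To make the matching rule concrete, the following standalone sketch recomputes a corpus-wide micro-F1 over (head text, relation, tail text) tuples, compared case-insensitively as in the task's metric code. It does not call the library; `micro_f1` and the plain-tuple representation of triplets are simplifications assumed for illustration.

```python
def micro_f1(gold_docs, pred_docs):
    """Corpus-wide micro-F1 over (head, relation, tail) tuples, matched case-insensitively."""
    tp = fp = fn = 0
    for gold, pred in zip(gold_docs, pred_docs):
        gold_set = {tuple(part.lower() for part in t) for t in gold}
        pred_set = {tuple(part.lower() for part in t) for t in pred}
        tp += len(gold_set & pred_set)   # predicted and correct
        fp += len(pred_set - gold_set)   # predicted but not in gold
        fn += len(gold_set - pred_set)   # in gold but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0


gold_docs = [
    [("Henri Dunant", "founded", "Red Cross"), ("Red Cross", "located_in", "Geneva")],
    [("Eglantyne Jebb", "founded", "Save the Children")],
]
pred_docs = [
    # One match (casing differs, still counts) and one missed triplet.
    [("henri dunant", "founded", "Red Cross")],
    # One match and one spurious triplet.
    [("Eglantyne Jebb", "founded", "Save the Children"), ("Save the Children", "located_in", "Paris")],
]

score = micro_f1(gold_docs, pred_docs)
print(f"micro-F1: {score:.3f}")
```

Here the counts are accumulated over all documents before precision and recall are computed (2 true positives, 1 false positive, 1 false negative), which is what distinguishes a micro-averaged score from an average of per-document F1 values.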

Ground Truth Formats

Ground truth has to be specified in doc.gold[task_id] using Result instances.

Relation extraction predictive task.

`RelationExtraction`

Bases: PredictiveTask[TaskPromptSignature, TaskResult, _TaskBridge]

Extract relations between entities in text.

Source code in sieves/tasks/predictive/relation_extraction/core.py

class RelationExtraction(PredictiveTask[TaskPromptSignature, TaskResult, _TaskBridge]):
    """Extract relations between entities in text."""

    def __init__(
        self,
        relations: Sequence[str] | dict[str, str],
        model: TaskModel,
        entity_types: Sequence[str] | dict[str, str] | None = None,
        task_id: str | None = None,
        include_meta: bool = True,
        batch_size: int = -1,
        prompt_instructions: str | None = None,
        fewshot_examples: Sequence[FewshotExample] = (),
        model_settings: ModelSettings = ModelSettings(),
        condition: Callable[[Doc], bool] | None = None,
    ) -> None:
        """Initialize RelationExtraction task.

        :param relations: Relations to extract. Can be a list of relation types or a dict mapping types to descriptions.
        :param model: Model to use.
        :param entity_types: Optional constraints on entity types involved in relations.
        :param task_id: Task ID.
        :param include_meta: Whether to include meta information generated by the task.
        :param batch_size: Batch size to use for inference. Use -1 to process all documents at once.
        :param prompt_instructions: Custom prompt instructions. If None, default instructions are used.
        :param fewshot_examples: Few-shot examples.
        :param model_settings: Settings for structured generation.
        :param condition: Optional callable that determines whether to process each document.
        """
        self._relations = relations
        self._entity_types = entity_types

        super().__init__(
            model=model,
            task_id=task_id,
            include_meta=include_meta,
            batch_size=batch_size,
            overwrite=False,
            prompt_instructions=prompt_instructions,
            fewshot_examples=fewshot_examples,
            model_settings=model_settings,
            condition=condition,
        )

    @property
    @override
    def fewshot_example_type(self) -> type[FewshotExample]:
        """Return few-shot example type.

        :return: Few-shot example type.
        """
        return FewshotExample

    @property
    @override
    def prompt_signature(self) -> type[pydantic.BaseModel]:
        """Return the unified Pydantic prompt signature for this task.

        :return: Unified Pydantic prompt signature.
        """
        # Define relation type literal if relations are provided.
        if isinstance(self._relations, dict):
            relation_names = list(self._relations.keys())
        else:
            relation_names = list(self._relations)

        RelationType = Literal[*(tuple(relation_names))] if relation_names else str  # type: ignore[invalid-type-form]

        # Define entity type literal if entity types are provided.
        entity_type_names: list[str] = []
        if self._entity_types:
            if isinstance(self._entity_types, dict):
                entity_type_names = list(self._entity_types.keys())
            else:
                entity_type_names = list(self._entity_types)

        EntityType = Literal[*(tuple(entity_type_names))] if entity_type_names else str  # type: ignore[invalid-type-form]

        # Create dynamic models.
        DynamicEntity = pydantic.create_model(
            "RelationEntity",
            text=(str, pydantic.Field(..., description="Surface text of the entity as it appears in the document.")),
            entity_type=(
                EntityType,
                pydantic.Field(..., description="The category or type of the entity."),
            ),
            __doc__="An entity involved in a relation.",
            __base__=pydantic.BaseModel,
        )

        DynamicTriplet = pydantic.create_model(
            "RelationTriplet",
            head=(DynamicEntity, pydantic.Field(..., description="The subject entity (head) of the relation.")),
            relation=(RelationType, pydantic.Field(..., description="The type of relation between the head and tail.")),
            tail=(DynamicEntity, pydantic.Field(..., description="The object entity (tail) of the relation.")),
            score=(
                float | None,
                pydantic.Field(
                    default=None, description="Provide a confidence score for this relation triplet, between 0 and 1."
                ),
            ),
            __doc__="A relation triplet consisting of a head entity, a relation type, and a tail entity.",
            __base__=pydantic.BaseModel,
        )

        return pydantic.create_model(
            "RelationExtractionOutput",
            __doc__="Result of relation extraction. Contains a list of extracted relation triplets.",
            triplets=(
                list[DynamicTriplet],  # type: ignore[invalid-type-form]
                pydantic.Field(..., description="List of extracted relation triplets."),
            ),
        )

    @property
    @override
    def metric(self) -> str:
        return "F1"

    @override
    def _compute_metrics(self, truths: list[Any], preds: list[Any], judge: dspy.LM | None = None) -> dict[str, float]:
        """Compute corpus-level metrics.

        :param truths: List of ground truths.
        :param preds: List of predictions.
        :param judge: Optional DSPy LM instance to use as judge for generative tasks.
        :return: Dictionary of metrics.
        """
        tp = 0
        fp = 0
        fn = 0

        for gold, pred in zip(truths, preds):
            if gold is not None:
                assert isinstance(gold, TaskResult)
                true_triplets = {(t.head.text.lower(), t.relation.lower(), t.tail.text.lower()) for t in gold.triplets}
            else:
                true_triplets = set()

            if pred is not None:
                assert isinstance(pred, TaskResult)
                pred_triplets = {(t.head.text.lower(), t.relation.lower(), t.tail.text.lower()) for t in pred.triplets}
            else:
                pred_triplets = set()

            tp += len(true_triplets & pred_triplets)
            fp += len(pred_triplets - true_triplets)
            fn += len(true_triplets - pred_triplets)

        precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
        recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
        f1 = 2 * precision * recall / (precision + recall) if (precision + recall) > 0 else 0.0

        return {self.metric: f1}

    @override
    def _init_bridge(self, model_type: ModelType) -> _TaskBridge:
        if model_type == ModelType.gliner:
            if self._entity_types is not None:
                warnings.warn(
                    "GliNER2 backend does not support entity type constraints for relation extraction. "
                    "The `entity_types` parameter will be ignored.",
                )

            return GliNERBridge(
                task_id=self._task_id,
                prompt_instructions=self._custom_prompt_instructions,
                prompt_signature=self.prompt_signature,
                model_settings=self._model_settings,
                inference_mode=gliner_.InferenceMode.relations,
            )

        bridge_types: dict[ModelType, type[DSPyRelationExtraction | PydanticRelationExtraction]] = {
            ModelType.dspy: DSPyRelationExtraction,
            ModelType.langchain: PydanticRelationExtraction,
            ModelType.outlines: PydanticRelationExtraction,
        }

        try:
            return bridge_types[model_type](
                task_id=self._task_id,
                relations=self._relations,
                entity_types=self._entity_types,
                prompt_instructions=self._custom_prompt_instructions,
                model_settings=self._model_settings,
                prompt_signature=self.prompt_signature,
                model_type=model_type,
                fewshot_examples=self._fewshot_examples,
            )
        except KeyError as err:
            raise KeyError(f"Model type {model_type} is not supported by {self.__class__.__name__}.") from err

    @staticmethod
    @override
    def supports() -> set[ModelType]:
        return {
            ModelType.dspy,
            ModelType.gliner,
            ModelType.langchain,
            ModelType.outlines,
        }

    @override
    @property
    def _state(self) -> dict[str, Any]:
        return {
            **super()._state,
            "relations": self._relations,
            "entity_types": self._entity_types,
        }

    @override
    def to_hf_dataset(self, docs: Iterable[Doc], threshold: float | None = None) -> datasets.Dataset:
        # Define metadata and features.
        entity_feature = datasets.Features(
            {
                "text": datasets.Value("string"),
                "entity_type": datasets.Value("string"),
            }
        )
        triplet_feature = datasets.Features(
            {
                "head": entity_feature,
                "relation": datasets.Value("string"),
                "tail": entity_feature,
                "score": datasets.Value("float32"),
            }
        )
        features = datasets.Features({"text": datasets.Value("string"), "triplets": datasets.Sequence(triplet_feature)})
        info = datasets.DatasetInfo(
            description=f"Relation extraction dataset. Generated with sieves v{Config.get_version()}.",
            features=features,
        )

        # Fetch data used for generating dataset.
        try:
            data: list[dict[str, Any]] = []
            for doc in docs:
                triplets: list[dict[str, Any]] = []
                for triplet in doc.results[self._task_id].triplets:
                    triplets.append(
                        {
                            "head": triplet.head.model_dump(),
                            "relation": triplet.relation,
                            "tail": triplet.tail.model_dump(),
                            "score": triplet.score,
                        }
                    )
                data.append({"text": doc.text, "triplets": triplets})
        except KeyError as err:
            raise KeyError(f"Not all documents have results for this task with ID {self._task_id}") from err

        # Create dataset.
        return datasets.Dataset.from_list(data, features=features, info=info)

    @override
    def distill(
        self,
        base_model_id: str,
        framework: DistillationFramework,
        data: datasets.Dataset | Sequence[Doc],
        output_path: Path | str,
        val_frac: float,
        init_kwargs: dict[str, Any] | None = None,
        train_kwargs: dict[str, Any] | None = None,
        seed: int | None = None,
    ) -> None:
        raise NotImplementedError

    @override
    def _evaluate_dspy_example(self, truth: dspy.Example, pred: dspy.Prediction, trace: Any, model: dspy.LM) -> float:
        # Compute triplet-level F1 score based on (head_text, relation, tail_text) triples.
        # Use lowercase for robust matching.
        true_triplets = {
            (t["head"]["text"].lower(), t["relation"].lower(), t["tail"]["text"].lower()) for t in truth["triplets"]
        }
        pred_triplets = {
            (t["head"]["text"].lower(), t["relation"].lower(), t["tail"]["text"].lower())
            for t in pred.get("triplets", [])
        }

        if not true_triplets:
            base_f1 = 1.0 if not pred_triplets else 0.0
        else:
            precision = len(true_triplets & pred_triplets) / len(pred_triplets) if pred_triplets else 0
            recall = len(true_triplets & pred_triplets) / len(true_triplets)
            base_f1 = 2 * precision * recall / (precision + recall) if (precision + recall) > 0 else 0

        # If score is available, incorporate it.
        if "triplets" in truth and "triplets" in pred:
            true_scores = [t.get("score") for t in truth["triplets"] if t.get("score") is not None]
            pred_scores = [t.get("score") for t in pred["triplets"] if t.get("score") is not None]

            if true_scores and pred_scores:
                true_avg = sum(true_scores) / len(true_scores)
                pred_avg = sum(pred_scores) / len(pred_scores)
                score_accuracy = 1 - abs(true_avg - max(min(pred_avg, 1), 0))
                return (base_f1 + score_accuracy) / 2

        return base_f1
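
The per-document score computed by `_evaluate_dspy_example` above blends triplet F1 with how closely the average predicted confidence matches the average gold confidence. The sketch below reproduces that arithmetic on plain dictionaries. It is a simplified illustration: `per_doc_score` is a hypothetical name, and head and tail are flattened to strings instead of nested entity dicts.

```python
def per_doc_score(truth, pred):
    """Blend triplet F1 with confidence agreement, following the logic shown above."""
    def key(t):
        return (t["head"].lower(), t["relation"].lower(), t["tail"].lower())

    true_set = {key(t) for t in truth}
    pred_set = {key(t) for t in pred}

    if not true_set:
        # No gold triplets: perfect only if nothing was predicted either.
        base_f1 = 1.0 if not pred_set else 0.0
    else:
        overlap = len(true_set & pred_set)
        precision = overlap / len(pred_set) if pred_set else 0.0
        recall = overlap / len(true_set)
        base_f1 = 2 * precision * recall / (precision + recall) if precision + recall > 0 else 0.0

    true_scores = [t["score"] for t in truth if t.get("score") is not None]
    pred_scores = [t["score"] for t in pred if t.get("score") is not None]
    if true_scores and pred_scores:
        true_avg = sum(true_scores) / len(true_scores)
        pred_avg = sum(pred_scores) / len(pred_scores)
        # Clamp the predicted average to [0, 1] before comparing.
        score_accuracy = 1 - abs(true_avg - max(min(pred_avg, 1), 0))
        return (base_f1 + score_accuracy) / 2
    return base_f1


truth = [
    {"head": "Henri Dunant", "relation": "founded", "tail": "Red Cross", "score": 1.0},
    {"head": "Red Cross", "relation": "located_in", "tail": "Geneva", "score": 1.0},
]
pred = [{"head": "Henri Dunant", "relation": "founded", "tail": "Red Cross", "score": 0.8}]

blended = per_doc_score(truth, pred)
unscored = per_doc_score(truth, [{k: v for k, v in pred[0].items() if k != "score"}])
print(blended, unscored)
```

With one of two gold triplets found, F1 is 2/3; the confidence agreement is 0.8, so the blended score is their mean. When the prediction carries no score, only the F1 term is returned.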

`fewshot_example_type` `property`

Return few-shot example type.

Returns:

Type	Description
`type[FewshotExample]`	Few-shot example type.

`fewshot_examples` `property`

Return few-shot examples.

Returns:

Type	Description
`Sequence[FewshotExample]`	Few-shot examples.

`id` `property`

Return task ID.

Used by pipeline for results and dependency management.

Returns:

Type	Description
`str`	Task ID.

`prompt_signature` `property`

Return the unified Pydantic prompt signature for this task.

Returns:

Type	Description
`type[BaseModel]`	Unified Pydantic prompt signature.
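
The signature restricts the `relation` field to the configured relation names by building a `Literal` type at runtime. The sketch below isolates that step using only the standard library. `build_relation_type` is a hypothetical helper, and it subscripts `Literal` with a tuple (equivalent to listing the members individually) instead of using star-unpacking, so that it also runs on Python versions before 3.11.

```python
from typing import Literal, get_args


def build_relation_type(relations):
    """Return a Literal of relation names, or plain `str` if none are configured."""
    names = list(relations.keys()) if isinstance(relations, dict) else list(relations)
    # Literal[("a", "b")] is the same type as Literal["a", "b"].
    return Literal[tuple(names)] if names else str


relations = {
    "works_for": "A person works for a company or organization.",
    "founded": "A person founded a company or organization.",
}

RelationType = build_relation_type(relations)
allowed = get_args(RelationType)
print(allowed)
```

A schema library such as Pydantic can then reject any generated relation that is not among the allowed names, which is why descriptions (the dict values) do not take part in the type itself.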

`prompt_signature_description` `property`

Return prompt signature description.

Returns:

Type	Description
`str \| None`	Prompt signature description.

`prompt_template` `property`

Return prompt template.

Returns:

Type	Description
`str`	Prompt template.

`__add__(other)`

Chain this task with another task or pipeline using the + operator.

This returns a new Pipeline that executes this task first, followed by the task(s) in other. The original task(s)/pipeline are not mutated.

Cache semantics:
- If other is a Pipeline, the resulting pipeline adopts other's use_cache setting (because the left-hand side is a single task).
- If other is a Task, the resulting pipeline defaults to use_cache=True.

Parameters:

Name	Type	Description	Default
`other`	`Task \| Pipeline`	A `Task` or `Pipeline` to execute after this task.	required

Returns:

Type	Description
`Pipeline`	A new `Pipeline` representing the chained execution.

Raises:

Type	Description
`TypeError`	If `other` is not a `Task` or `Pipeline`.

Source code in sieves/tasks/core.py

def __add__(self, other: Task | Pipeline) -> Pipeline:
    """Chain this task with another task or pipeline using the ``+`` operator.

    This returns a new ``Pipeline`` that executes this task first, followed by the
    task(s) in ``other``. The original task(s)/pipeline are not mutated.

    Cache semantics:
    - If ``other`` is a ``Pipeline``, the resulting pipeline adopts ``other``'s
      ``use_cache`` setting (because the left-hand side is a single task).
    - If ``other`` is a ``Task``, the resulting pipeline defaults to ``use_cache=True``.

    :param other: A ``Task`` or ``Pipeline`` to execute after this task.
    :return: A new ``Pipeline`` representing the chained execution.
    :raises TypeError: If ``other`` is not a ``Task`` or ``Pipeline``.
    """
    # Lazy import to avoid circular dependency at module import time.
    from sieves.pipeline import Pipeline

    if isinstance(other, Pipeline):
        return Pipeline(tasks=[self, *other.tasks], use_cache=other.use_cache)

    if isinstance(other, Task):
        return Pipeline(tasks=[self, other])

    raise TypeError(f"Cannot chain Task with {type(other).__name__}")
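
The chaining and cache semantics can be tried out with a small stand-in model. The `Task` and `Pipeline` classes below are deliberately minimal placeholders written for this sketch, not the library's classes; only the `__add__` logic follows the source shown above.

```python
from dataclasses import dataclass


@dataclass
class Pipeline:
    """Stand-in pipeline: an ordered task list plus a cache flag."""
    tasks: list
    use_cache: bool = True


class Task:
    """Stand-in task that only carries a name."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __add__(self, other: "Task | Pipeline") -> "Pipeline":
        if isinstance(other, Pipeline):
            # Right-hand pipeline decides the cache setting.
            return Pipeline(tasks=[self, *other.tasks], use_cache=other.use_cache)
        if isinstance(other, Task):
            return Pipeline(tasks=[self, other])
        raise TypeError(f"Cannot chain Task with {type(other).__name__}")


a, b, c = Task("a"), Task("b"), Task("c")

pair = a + b                                      # Task + Task
uncached = Pipeline(tasks=[b, c], use_cache=False)
chained = a + uncached                            # Task + Pipeline

print([t.name for t in chained.tasks], chained.use_cache)
```

Because a new `Pipeline` is built each time, `uncached` still holds only its original two tasks after the addition.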

`__call__(docs)`

Execute task with conditional logic.

Checks the condition for each document without materializing all docs upfront. Passes all documents that pass the condition to _call() for proper batching. Documents that fail the condition have results[task_id] set to None.

Parameters:

Name	Type	Description	Default
`docs`	`Iterable[Doc]`	Docs to process.	required

Returns:

Type	Description
`Iterable[Doc]`	Processed docs (in original order).

Source code in sieves/tasks/core.py

def __call__(self, docs: Iterable[Doc]) -> Iterable[Doc]:
    """Execute task with conditional logic.

    Checks the condition for each document without materializing all docs upfront.
    Passes all documents that pass the condition to _call() for proper batching.
    Documents that fail the condition have results[task_id] set to None.

    :param docs: Docs to process.
    :return: Processed docs (in original order).
    """
    docs = iter(docs) if not isinstance(docs, Iterator) else docs

    # Materialize docs in batches. This doesn't incur additional memory overhead, as docs are materialized in
    # batches downstream anyway.
    batch_size = self._batch_size if self._batch_size > 0 else sys.maxsize
    while docs_batch := [doc for doc in itertools.islice(docs, batch_size)]:
        # First pass: determine which docs pass the condition by index.
        passing_indices: set[int] = {
            idx for idx, doc in enumerate(docs_batch) if self._condition is None or self._condition(doc)
        }

        # Process all passing docs in one batch.
        processed = self._call(d for i, d in enumerate(docs_batch) if i in passing_indices)
        processed_iter = iter(processed) if not isinstance(processed, Iterator) else processed

        # Iterate through original docs in order and yield results.
        for idx, doc in enumerate(docs_batch):
            if idx in passing_indices:
                # Doc passed condition - use processed result.
                yield next(processed_iter)
            else:
                # Doc failed condition - set `None` result and yield original.
                doc.results[self.id] = None
                yield doc
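
The condition-aware batching above can be sketched independently of the library. In this illustration documents are plain dicts, `process` stands in for `_call()`, and `run_with_condition` is a hypothetical name; the control flow (batch, filter by condition, process the passing docs together, yield in original order) follows the source.

```python
import itertools
import sys


def run_with_condition(docs, process, condition=None, batch_size=-1):
    """Yield docs in input order; docs failing `condition` get result None."""
    docs_iter = iter(docs)
    size = batch_size if batch_size > 0 else sys.maxsize
    while batch := list(itertools.islice(docs_iter, size)):
        passing = {i for i, d in enumerate(batch) if condition is None or condition(d)}
        # All passing docs of this batch are processed in a single call.
        processed = iter(process([d for i, d in enumerate(batch) if i in passing]))
        for i, d in enumerate(batch):
            if i in passing:
                yield next(processed)
            else:
                d["result"] = None
                yield d


calls = []  # records how many docs each `process` call received


def process(batch):
    calls.append(len(batch))
    return [{**d, "result": len(d["text"])} for d in batch]


docs = [{"text": "a"}, {"text": ""}, {"text": "ccc"}, {"text": "dd"}, {"text": ""}]
out = list(run_with_condition(docs, process, condition=lambda d: bool(d["text"]), batch_size=2))
print([d["result"] for d in out], calls)
```

Note that the processing function is still invoked for a batch in which no document passes the condition; it simply receives an empty input.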

`__init__(relations, model, entity_types=None, task_id=None, include_meta=True, batch_size=-1, prompt_instructions=None, fewshot_examples=(), model_settings=ModelSettings(), condition=None)`

Initialize RelationExtraction task.

Parameters:

Name	Type	Description	Default
`relations`	`Sequence[str] \| dict[str, str]`	Relations to extract. Can be a list of relation types or a dict mapping types to descriptions.	required
`model`	`TaskModel`	Model to use.	required
`entity_types`	`Sequence[str] \| dict[str, str] \| None`	Optional constraints on entity types involved in relations.	`None`
`task_id`	`str \| None`	Task ID.	`None`
`include_meta`	`bool`	Whether to include meta information generated by the task.	`True`
`batch_size`	`int`	Batch size to use for inference. Use -1 to process all documents at once.	`-1`
`prompt_instructions`	`str \| None`	Custom prompt instructions. If None, default instructions are used.	`None`
`fewshot_examples`	`Sequence[FewshotExample]`	Few-shot examples.	`()`
`model_settings`	`ModelSettings`	Settings for structured generation.	`ModelSettings()`
`condition`	`Callable[[Doc], bool] \| None`	Optional callable that determines whether to process each document.	`None`

Source code in sieves/tasks/predictive/relation_extraction/core.py

def __init__(
    self,
    relations: Sequence[str] | dict[str, str],
    model: TaskModel,
    entity_types: Sequence[str] | dict[str, str] | None = None,
    task_id: str | None = None,
    include_meta: bool = True,
    batch_size: int = -1,
    prompt_instructions: str | None = None,
    fewshot_examples: Sequence[FewshotExample] = (),
    model_settings: ModelSettings = ModelSettings(),
    condition: Callable[[Doc], bool] | None = None,
) -> None:
    """Initialize RelationExtraction task.

    :param relations: Relations to extract. Can be a list of relation types or a dict mapping types to descriptions.
    :param model: Model to use.
    :param entity_types: Optional constraints on entity types involved in relations.
    :param task_id: Task ID.
    :param include_meta: Whether to include meta information generated by the task.
    :param batch_size: Batch size to use for inference. Use -1 to process all documents at once.
    :param prompt_instructions: Custom prompt instructions. If None, default instructions are used.
    :param fewshot_examples: Few-shot examples.
    :param model_settings: Settings for structured generation.
    :param condition: Optional callable that determines whether to process each document.
    """
    self._relations = relations
    self._entity_types = entity_types

    super().__init__(
        model=model,
        task_id=task_id,
        include_meta=include_meta,
        batch_size=batch_size,
        overwrite=False,
        prompt_instructions=prompt_instructions,
        fewshot_examples=fewshot_examples,
        model_settings=model_settings,
        condition=condition,
    )

`deserialize(config, **kwargs)` `classmethod`

Generate PredictiveTask instance from config.

Parameters:

Name	Type	Description	Default
`config`	`Config`	Config to generate instance from.	required
`kwargs`	`dict[str, Any]`	Values to inject into loaded config.	`{}`

Returns:

Type	Description
`PredictiveTask[TaskPromptSignature, TaskResult, TaskBridge]`	Deserialized PredictiveTask instance.

Source code in sieves/tasks/predictive/core.py

@classmethod
def deserialize(
    cls, config: Config, **kwargs: dict[str, Any]
) -> PredictiveTask[TaskPromptSignature, TaskResult, TaskBridge]:
    """Generate PredictiveTask instance from config.

    :param config: Config to generate instance from.
    :param kwargs: Values to inject into loaded config.
    :return PredictiveTask[TaskPromptSignature, TaskResult, _TaskBridge]: Deserialized PredictiveTask instance.
    """
    init_dict = config.to_init_dict(cls, **kwargs)
    init_dict["model_settings"] = ModelSettings.model_validate(init_dict["model_settings"])

    return cls(**init_dict)

`evaluate(docs, judge=None, failure_threshold=0.5)`

Evaluate task performance using DSPy-based evaluation.

Parameters:

Name	Type	Description	Default
`docs`	`Iterable[Doc]`	Documents to evaluate.	required
`judge`	`LM \| None`	Optional DSPy LM instance to use as judge for generative tasks.	`None`
`failure_threshold`	`float`	Decision threshold for whether to mark predictions as failures.	`0.5`

Returns:

Type	Description
`TaskEvaluationReport`	Evaluation report.

Source code in sieves/tasks/predictive/core.py

def evaluate(
    self, docs: Iterable[Doc], judge: dspy.LM | None = None, failure_threshold: float = 0.5
) -> TaskEvaluationReport:
    """Evaluate task performance using DSPy-based evaluation.

    :param docs: Documents to evaluate.
    :param judge: Optional DSPy LM instance to use as judge for generative tasks.
    :param failure_threshold: Decision threshold for whether to mark predictions as failures.
    :return: Evaluation report.
    """
    truths: list[Any] = []
    preds: list[Any] = []
    failures: list[Doc] = []

    # Evaluate each doc individually to identify failed predictions.
    for doc in docs:
        if self.id not in doc.results:
            continue

        pred = doc.results[self.id]
        gold = doc.gold.get(self.id, None)

        # Accumulate for corpus-level metrics.
        truths.append(gold)
        preds.append(pred)

        # If gold or prediction is None: we cannot do proper evaluation, so we just check whether they're both None
        # to compute score for failure analysis.
        if gold is None or pred is None:
            if gold is not None or pred is not None:
                failures.append(doc)
        else:
            # Convert result and gold to DSPy representation.
            truth = dspy.Example(**self._task_result_to_dspy_dict(gold))
            pred_dspy = dspy.Prediction(**self._task_result_to_dspy_dict(pred))

            # Call internal evaluation logic for per-doc failure analysis.
            score = self._evaluate_dspy_example(truth, pred_dspy, trace=None, model=judge)

            if score < failure_threshold:
                failures.append(doc)

    # Evaluate on corpus level to obtain representative metrics.
    metrics = self._compute_metrics(truths, preds, judge=judge)

    return TaskEvaluationReport(
        metrics=metrics,
        task_id=self.id,
        failures=failures,
    )
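
The per-document failure analysis in `evaluate()` reduces to a small decision rule, sketched below without the library. `find_failures` and `set_f1` are hypothetical helpers, and gold and predicted results are represented as plain sets for brevity.

```python
def set_f1(gold, pred):
    """F1 between two sets; stands in for the task-specific per-document score."""
    overlap = len(gold & pred)
    if not overlap:
        return 0.0
    precision, recall = overlap / len(pred), overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


def find_failures(pairs, score_fn, failure_threshold=0.5):
    """Return indices of (gold, pred) pairs that count as failures."""
    failures = []
    for idx, (gold, pred) in enumerate(pairs):
        if gold is None or pred is None:
            # Without both sides no score can be computed: fail unless both are None.
            if gold is not None or pred is not None:
                failures.append(idx)
        elif score_fn(gold, pred) < failure_threshold:
            failures.append(idx)
    return failures


pairs = [
    ({"a", "b"}, {"a", "b"}),  # perfect match
    ({"a", "b"}, {"c"}),       # no overlap
    (None, None),              # skipped by condition and no gold
    ({"a"}, None),             # gold exists but nothing was predicted
    ({"a", "b"}, {"a"}),       # partial match, F1 of about 0.67
]

failures = find_failures(pairs, set_f1)
print(failures)
```

The failure list is a diagnostic aid only; the reported metric is computed separately at corpus level from all accumulated gold and predicted results.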

`optimize(optimizer, verbose=True)`

Optimize task prompt and few-shot examples with the available optimization config.

Updates task to use best prompt and few-shot examples found by the optimizer.

Parameters:

Name	Type	Description	Default
`optimizer`	`Optimizer`	Optimizer to run.	required
`verbose`	`bool`	Whether to show optimizer output. If `False`, DSPy and Optuna logs and Python warnings are suppressed so that only errors are printed. DSPy produces a good amount of logs, so this can be useful to not pollute your terminal.	`True`

Returns:

Type	Description
`tuple[str, Sequence[FewshotExample]]`	Best found prompt and few-shot examples.

Source code in sieves/tasks/predictive/core.py

def optimize(self, optimizer: optimization.Optimizer, verbose: bool = True) -> tuple[str, Sequence[FewshotExample]]:
    """Optimize task prompt and few-shot examples with the available optimization config.

    Updates task to use best prompt and few-shot examples found by the optimizer.

    :param optimizer: Optimizer to run.
    :param verbose: Whether to show optimizer output. If False, DSPy and Optuna logs and Python warnings are
        suppressed so that only errors are printed, which keeps the terminal clean.

    :return tuple[str, Sequence[FewshotExample]]: Best found prompt and few-shot examples.
    """
    assert len(self._fewshot_examples) > 1, "At least two few-shot examples need to be provided to optimize."

    # Run optimizer to get best prompt and few-shot examples.
    signature = self._get_task_signature()
    dspy_examples = [ex.to_dspy() for ex in self._fewshot_examples]

    def _pred_eval(truth: dspy.Example, pred: dspy.Prediction, trace: Any | None = None) -> float:
        """Wrap optimization evaluation, inject model.

        :param truth: Ground truth.
        :param pred: Predicted value.
        :param trace: Optional trace information.
        :return: Metric value between 0.0 and 1.0.
        :raises KeyError: If target fields are missing from truth or prediction.
        :raises ValueError: If similarity score cannot be parsed from LLM response.
        """
        return self._evaluate_dspy_example(truth, pred, trace, model=optimizer.model)

    if verbose:
        best_prompt, best_examples = optimizer(signature, dspy_examples, _pred_eval, verbose=verbose)
    else:
        # Temporarily suppress DSPy logs.
        dspy_logger = logging.getLogger("dspy")
        optuna_logger = logging.getLogger("optuna")
        original_dspy_level = dspy_logger.level
        original_optuna_level = optuna_logger.level

        try:
            dspy_logger.setLevel(logging.ERROR)
            optuna_logger.setLevel(logging.ERROR)
            with warnings.catch_warnings():
                warnings.simplefilter("ignore")
                best_prompt, best_examples = optimizer(signature, dspy_examples, _pred_eval, verbose=verbose)
        finally:
            dspy_logger.setLevel(original_dspy_level)
            optuna_logger.setLevel(original_optuna_level)

    # Update few-shot examples and prompt instructions.
    fewshot_example_cls = self._fewshot_examples[0].__class__
    self._fewshot_examples = [fewshot_example_cls.from_dspy(ex) for ex in best_examples]
    self._validate_fewshot_examples()
    self._custom_prompt_instructions = best_prompt

    # Reinitialize bridge to use new prompt and few-shot examples.
    self._bridge = self._init_bridge(ModelType.get_model_type(self._model_wrapper))

    return best_prompt, self._fewshot_examples
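
The non-verbose branch above temporarily raises logger levels and silences warnings, then restores the original levels in a `finally` block. The same pattern can be packaged as a reusable context manager, sketched here with the standard library only. The `quiet` helper and the logger name `noisy_optimizer` are hypothetical and not part of the library.

```python
import contextlib
import logging
import warnings


@contextlib.contextmanager
def quiet(*logger_names, level=logging.ERROR):
    """Temporarily raise the level of the named loggers and ignore warnings."""
    loggers = [logging.getLogger(name) for name in logger_names]
    original_levels = [lg.level for lg in loggers]
    try:
        for lg in loggers:
            lg.setLevel(level)
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")
            yield
    finally:
        # Restore levels even if the wrapped code raises.
        for lg, lvl in zip(loggers, original_levels):
            lg.setLevel(lvl)


noisy = logging.getLogger("noisy_optimizer")  # hypothetical logger name
noisy.setLevel(logging.INFO)

with quiet("noisy_optimizer"):
    inside = noisy.level   # level while suppressed

after = noisy.level        # level after leaving the block
print(inside, after)
```

Restoring in `finally` matters because an optimizer run can fail midway; without it, a crash would leave the loggers permanently muted for the rest of the process.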

`serialize()`

Serialize task.

Returns:

Type	Description
`Config`	Config instance.

Source code in sieves/tasks/core.py

def serialize(self) -> Config:
    """Serialize task.

    :return: Config instance.
    """
    return Config.create(self.__class__, {k: Attribute(value=v) for k, v in self._state.items()})

Bridges for relation extraction task.

`DSPyRelationExtraction`

Bases: RelationExtractionBridge[PromptSignature, Result, InferenceMode]

DSPy bridge for relation extraction.

Source code in sieves/tasks/predictive/relation_extraction/bridges.py

class DSPyRelationExtraction(RelationExtractionBridge[dspy_.PromptSignature, dspy_.Result, dspy_.InferenceMode]):
    """DSPy bridge for relation extraction."""

    @override
    def _validate(self) -> None:
        assert self._model_type == ModelType.dspy

    @property
    @override
    def _chunk_extractor(self) -> Callable[[Any], Iterable[pydantic.BaseModel]]:
        return lambda res: res.triplets

    @override
    @property
    def _default_prompt_instructions(self) -> str:
        return ""

    @override
    @property
    def inference_mode(self) -> dspy_.InferenceMode:
        return self._model_settings.inference_mode or dspy_.InferenceMode.predict

`model_settings` `property`

Return model settings.

Returns:

Type	Description
`ModelSettings`	Model settings.

`model_type` `property`

Return model type.

Returns:

Type	Description
`ModelType`	Model type.

`prompt_template` `property`

Return prompt template.

Chains _prompt_instructions, _prompt_example_xml and _prompt_conclusion.

Note: different models have different expectations of what a prompt should look like. For example, outlines supports the Jinja 2 templating format for inserting values and few-shot examples, whereas DSPy integrates these at a different point in the workflow and hence expects the prompt not to include them. Mind model-specific expectations when creating a prompt template.

Returns:

Type	Description
`str`	Prompt template as string. None if not used by model wrapper.

`__init__(task_id, relations, entity_types, prompt_instructions, model_settings, prompt_signature, model_type, fewshot_examples=())`

Initialize relation extraction bridge.

Parameters:

Name	Type	Description	Default
`task_id`	`str`	Task ID.	required
`relations`	`Sequence[str] | dict[str, str]`	Relations to extract. Can be a list of relation types or a dict mapping types to descriptions.	required
`entity_types`	`Sequence[str] | dict[str, str] | None`	Entity types constraints.	required
`prompt_instructions`	`str | None`	Custom prompt instructions. If None, default instructions are used.	required
`model_settings`	`ModelSettings`	Settings for structured generation.	required
`prompt_signature`	`type[BaseModel]`	Unified Pydantic prompt signature.	required
`model_type`	`ModelType`	Model type.	required
`fewshot_examples`	`Sequence[BaseModel]`	Few-shot examples.	`()`

Source code in sieves/tasks/predictive/relation_extraction/bridges.py

def __init__(
    self,
    task_id: str,
    relations: Sequence[str] | dict[str, str],
    entity_types: Sequence[str] | dict[str, str] | None,
    prompt_instructions: str | None,
    model_settings: ModelSettings,
    prompt_signature: type[pydantic.BaseModel],
    model_type: ModelType,
    fewshot_examples: Sequence[pydantic.BaseModel] = (),
):
    """Initialize relation extraction bridge.

    :param task_id: Task ID.
    :param relations: Relations to extract. Can be a list of relation types or a dict mapping types to descriptions.
    :param entity_types: Entity types constraints.
    :param prompt_instructions: Custom prompt instructions. If None, default instructions are used.
    :param model_settings: Settings for structured generation.
    :param prompt_signature: Unified Pydantic prompt signature.
    :param model_type: Model type.
    :param fewshot_examples: Few-shot examples.
    """
    super().__init__(
        task_id=task_id,
        prompt_instructions=prompt_instructions,
        overwrite=False,
        model_settings=model_settings,
        prompt_signature=prompt_signature,
        model_type=model_type,
        fewshot_examples=fewshot_examples,
    )
    if isinstance(relations, dict):
        self._relations = list(relations.keys())
        self._relation_descriptions = relations
    else:
        self._relations = list(relations)
        self._relation_descriptions = {}

    if isinstance(entity_types, dict):
        self._entity_types = list(entity_types.keys())
        self._entity_type_descriptions = entity_types
    elif entity_types is not None:
        self._entity_types = list(entity_types)
        self._entity_type_descriptions = {}
    else:
        self._entity_types = None
        self._entity_type_descriptions = {}

    self._consolidation_strategy = MultiEntityConsolidation(extractor=self._chunk_extractor)
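The constructor above accepts labels either as a plain sequence or as a dict mapping labels to descriptions, and splits them into a name list plus a description mapping. A minimal sketch of that normalization follows. `normalize_labels` is a hypothetical helper that merges the `relations` and `entity_types` branches into one function (in the constructor, only `entity_types` may be `None`).

```python
from __future__ import annotations

from collections.abc import Sequence


def normalize_labels(
    labels: Sequence[str] | dict[str, str] | None,
) -> tuple[list[str] | None, dict[str, str]]:
    """Split labels into (names, descriptions), following the branches in __init__ above."""
    if isinstance(labels, dict):
        # Dict input: keys are the label names, values their descriptions.
        return list(labels.keys()), labels
    if labels is not None:
        # Plain sequence: names only, no descriptions.
        return list(labels), {}
    # None: no constraint at all (only meaningful for entity types).
    return None, {}


described = normalize_labels({"founded": "A person founded a company or organization."})
plain = normalize_labels(["works_for", "located_in"])
unconstrained = normalize_labels(None)
```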

`extract(docs)`

Extract all values from doc instances that are to be injected into the prompts.

Parameters:

Name	Type	Description	Default
`docs`	`Sequence[Doc]`	Docs to extract values from.	required

Returns:

Type	Description
`Sequence[dict[str, Any]]`	All values from doc instances that are to be injected into the prompts as a sequence.

Source code in sieves/tasks/predictive/bridges.py

def extract(self, docs: Sequence[Doc]) -> Sequence[dict[str, Any]]:
    """Extract all values from doc instances that are to be injected into the prompts.

    :param docs: Docs to extract values from.
    :return: All values from doc instances that are to be injected into the prompts as a sequence.
    """
    return [{"text": doc.text if doc.text else None} for doc in docs]
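To make the behavior of `extract` concrete, the sketch below runs the same list comprehension over a few documents. The `Doc` dataclass here is a hypothetical stand-in with only a `text` field; the real `Doc` class has more attributes.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for the real Doc class; only `text` is modeled."""

    text: str | None = None


def extract(docs: list[Doc]) -> list[dict[str, str | None]]:
    # Same expression as in the listing above: empty or missing text becomes None.
    return [{"text": doc.text if doc.text else None} for doc in docs]


values = extract([Doc(text="Henri Dunant founded the Red Cross in Geneva."), Doc(text=""), Doc()])
```

Note that an empty string and a missing text both map to `None`, so downstream prompt rendering sees a single "no text" value.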

`PydanticRelationExtraction`

Bases: RelationExtractionBridge[BaseModel, BaseModel | list[Any], ModelWrapperInferenceMode], ABC

Base class for Pydantic-based relation extraction bridges.

Source code in sieves/tasks/predictive/relation_extraction/bridges.py

class PydanticRelationExtraction(
    RelationExtractionBridge[pydantic.BaseModel, pydantic.BaseModel | list[Any], ModelWrapperInferenceMode], abc.ABC
):
    """Base class for Pydantic-based relation extraction bridges."""

    @override
    def _validate(self) -> None:
        assert self._model_type in {ModelType.langchain, ModelType.outlines}

    @property
    @override
    def _chunk_extractor(self) -> Callable[[Any], Iterable[pydantic.BaseModel]]:
        return lambda res: res.triplets

    @override
    @property
    def _default_prompt_instructions(self) -> str:
        return (
            "Extract relations between entities in the text.\n"
            f"Relations: {self._relations}\n"
            f"Entity Types: {self._entity_types or 'Any'}\n"
            "Return a list of triplets with head, relation, tail, and a confidence score between 0.0 and 1.0."
        )

    @override
    @property
    def _prompt_conclusion(self) -> str | None:
        return "========\n<text>{{ text }}</text>"

    @override
    @property
    def model_type(self) -> ModelType:
        return self._model_type

    @override
    @property
    def inference_mode(self) -> outlines_.InferenceMode | langchain_.InferenceMode:
        if self._model_type == ModelType.outlines:
            return self._model_settings.inference_mode or outlines_.InferenceMode.json
        elif self._model_type == ModelType.langchain:
            return self._model_settings.inference_mode or langchain_.InferenceMode.structured

        raise ValueError(f"Unsupported model type: {self._model_type}")
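The `inference_mode` property above falls back to a per-backend default when no mode is configured and rejects unsupported model types. A simplified sketch of that dispatch is shown below. All enums are hypothetical stand-ins for the real `ModelType` and backend `InferenceMode` classes, and the `text` member is invented for illustration only.

```python
from __future__ import annotations

import enum


class ModelType(enum.Enum):
    """Hypothetical stand-in for the real ModelType enum."""

    outlines = "outlines"
    langchain = "langchain"
    dspy = "dspy"


class OutlinesMode(enum.Enum):
    json = "json"
    text = "text"  # Invented member, only used to show an explicit override.


class LangChainMode(enum.Enum):
    structured = "structured"


def resolve_inference_mode(model_type: ModelType, configured: enum.Enum | None = None) -> enum.Enum:
    """Return the configured mode, or the backend default if none is set."""
    if model_type is ModelType.outlines:
        return configured or OutlinesMode.json
    if model_type is ModelType.langchain:
        return configured or LangChainMode.structured
    raise ValueError(f"Unsupported model type: {model_type}")


default_mode = resolve_inference_mode(ModelType.outlines)
explicit_mode = resolve_inference_mode(ModelType.outlines, OutlinesMode.text)
```

The `configured or default` idiom works here because plain `Enum` members are always truthy, so only `None` triggers the fallback.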

`model_settings` `property`

Return model settings.

Returns:

Type	Description
`ModelSettings`	Model settings.

`prompt_template` `property`

Return prompt template.

Chains _prompt_instructions, _prompt_example_xml and _prompt_conclusion.

Note: different models have different expectations of what a prompt should look like. For example, outlines supports the Jinja 2 templating format for inserting values and few-shot examples, whereas DSPy integrates these at a different point in the workflow and hence expects the prompt not to include them. Mind model-specific expectations when creating a prompt template.

Returns:

Type	Description
`str`	Prompt template as string. None if not used by model wrapper.
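As a rough illustration of the chaining described above, the sketch below joins three prompt parts into one template string. `build_prompt_template` and the sample instruction text are hypothetical; only the conclusion string is taken from the `_prompt_conclusion` property of this bridge. Placeholder substitution uses plain `str.replace` as a stand-in for a real Jinja 2 render.

```python
from __future__ import annotations


def build_prompt_template(instructions: str, example_block: str | None, conclusion: str | None) -> str:
    """Join the non-empty prompt parts with newlines (hypothetical helper)."""
    parts = [instructions, example_block, conclusion]
    return "\n".join(part for part in parts if part)


template = build_prompt_template(
    instructions="Extract relations between entities in the text.",
    example_block=None,  # No few-shot examples in this sketch.
    conclusion="========\n<text>{{ text }}</text>",
)

# Stand-in for Jinja 2 rendering: substitute the single placeholder by hand.
prompt = template.replace("{{ text }}", "Henri Dunant founded the Red Cross in Geneva.")
print(prompt)
```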

`__init__(task_id, relations, entity_types, prompt_instructions, model_settings, prompt_signature, model_type, fewshot_examples=())`

Initialize relation extraction bridge.

Parameters:

Name	Type	Description	Default
`task_id`	`str`	Task ID.	required
`relations`	`Sequence[str] | dict[str, str]`	Relations to extract. Can be a list of relation types or a dict mapping types to descriptions.	required
`entity_types`	`Sequence[str] | dict[str, str] | None`	Entity types constraints.	required
`prompt_instructions`	`str | None`	Custom prompt instructions. If None, default instructions are used.	required
`model_settings`	`ModelSettings`	Settings for structured generation.	required
`prompt_signature`	`type[BaseModel]`	Unified Pydantic prompt signature.	required
`model_type`	`ModelType`	Model type.	required
`fewshot_examples`	`Sequence[BaseModel]`	Few-shot examples.	`()`

Source code in sieves/tasks/predictive/relation_extraction/bridges.py

def __init__(
    self,
    task_id: str,
    relations: Sequence[str] | dict[str, str],
    entity_types: Sequence[str] | dict[str, str] | None,
    prompt_instructions: str | None,
    model_settings: ModelSettings,
    prompt_signature: type[pydantic.BaseModel],
    model_type: ModelType,
    fewshot_examples: Sequence[pydantic.BaseModel] = (),
):
    """Initialize relation extraction bridge.

    :param task_id: Task ID.
    :param relations: Relations to extract. Can be a list of relation types or a dict mapping types to descriptions.
    :param entity_types: Entity types constraints.
    :param prompt_instructions: Custom prompt instructions. If None, default instructions are used.
    :param model_settings: Settings for structured generation.
    :param prompt_signature: Unified Pydantic prompt signature.
    :param model_type: Model type.
    :param fewshot_examples: Few-shot examples.
    """
    super().__init__(
        task_id=task_id,
        prompt_instructions=prompt_instructions,
        overwrite=False,
        model_settings=model_settings,
        prompt_signature=prompt_signature,
        model_type=model_type,
        fewshot_examples=fewshot_examples,
    )
    if isinstance(relations, dict):
        self._relations = list(relations.keys())
        self._relation_descriptions = relations
    else:
        self._relations = list(relations)
        self._relation_descriptions = {}

    if isinstance(entity_types, dict):
        self._entity_types = list(entity_types.keys())
        self._entity_type_descriptions = entity_types
    elif entity_types is not None:
        self._entity_types = list(entity_types)
        self._entity_type_descriptions = {}
    else:
        self._entity_types = None
        self._entity_type_descriptions = {}

    self._consolidation_strategy = MultiEntityConsolidation(extractor=self._chunk_extractor)

`extract(docs)`

Extract all values from doc instances that are to be injected into the prompts.

Parameters:

Name	Type	Description	Default
`docs`	`Sequence[Doc]`	Docs to extract values from.	required

Returns:

Type	Description
`Sequence[dict[str, Any]]`	All values from doc instances that are to be injected into the prompts as a sequence.

Source code in sieves/tasks/predictive/bridges.py

def extract(self, docs: Sequence[Doc]) -> Sequence[dict[str, Any]]:
    """Extract all values from doc instances that are to be injected into the prompts.

    :param docs: Docs to extract values from.
    :return: All values from doc instances that are to be injected into the prompts as a sequence.
    """
    return [{"text": doc.text if doc.text else None} for doc in docs]

`RelationExtractionBridge`

Bases: Bridge[_BridgePromptSignature, _BridgeResult, ModelWrapperInferenceMode], ABC

Abstract base class for relation extraction bridges.

Source code in sieves/tasks/predictive/relation_extraction/bridges.py

class RelationExtractionBridge(Bridge[_BridgePromptSignature, _BridgeResult, ModelWrapperInferenceMode], abc.ABC):
    """Abstract base class for relation extraction bridges."""

    def __init__(
        self,
        task_id: str,
        relations: Sequence[str] | dict[str, str],
        entity_types: Sequence[str] | dict[str, str] | None,
        prompt_instructions: str | None,
        model_settings: ModelSettings,
        prompt_signature: type[pydantic.BaseModel],
        model_type: ModelType,
        fewshot_examples: Sequence[pydantic.BaseModel] = (),
    ):
        """Initialize relation extraction bridge.

        :param task_id: Task ID.
        :param relations: Relations to extract. Can be a list of relation types or a dict mapping types to descriptions.
        :param entity_types: Entity types constraints.
        :param prompt_instructions: Custom prompt instructions. If None, default instructions are used.
        :param model_settings: Settings for structured generation.
        :param prompt_signature: Unified Pydantic prompt signature.
        :param model_type: Model type.
        :param fewshot_examples: Few-shot examples.
        """
        super().__init__(
            task_id=task_id,
            prompt_instructions=prompt_instructions,
            overwrite=False,
            model_settings=model_settings,
            prompt_signature=prompt_signature,
            model_type=model_type,
            fewshot_examples=fewshot_examples,
        )
        if isinstance(relations, dict):
            self._relations = list(relations.keys())
            self._relation_descriptions = relations
        else:
            self._relations = list(relations)
            self._relation_descriptions = {}

        if isinstance(entity_types, dict):
            self._entity_types = list(entity_types.keys())
            self._entity_type_descriptions = entity_types
        elif entity_types is not None:
            self._entity_types = list(entity_types)
            self._entity_type_descriptions = {}
        else:
            self._entity_types = None
            self._entity_type_descriptions = {}

        self._consolidation_strategy = MultiEntityConsolidation(extractor=self._chunk_extractor)

    @override
    @property
    def prompt_signature(self) -> _BridgePromptSignature:
        return convert_to_signature(
            model_cls=self._pydantic_signature,
            model_type=self.model_type,
            mode="relations",
        )  # type: ignore[return-value]

    @property
    @abc.abstractmethod
    def _chunk_extractor(self) -> Callable[[Any], Iterable[pydantic.BaseModel]]:
        """Return a callable that extracts the relation triplets from a raw chunk result.

        :return: Extractor callable.
        """

    def _get_relation_descriptions(self) -> str:
        """Return relation descriptions as a string.

        :return: Relation descriptions.
        """
        descs: list[str] = []
        for rel in self._relations:
            if rel in self._relation_descriptions:
                descs.append(
                    f"  <relation_description>\n    <relation>{rel}</relation>\n    <description>"
                    f"{self._relation_descriptions[rel]}</description>\n  </relation_description>"
                )
            else:
                descs.append(f"  <relation>{rel}</relation>")
        return "\n".join(descs)

    def _get_entity_type_descriptions(self) -> str:
        """Return entity type descriptions as a string.

        :return: Entity type descriptions.
        """
        if self._entity_types is None:
            return "Unbounded"

        descs: list[str] = []
        for et in self._entity_types:
            if et in self._entity_type_descriptions:
                descs.append(
                    f"  <entity_type_description>\n    <type>{et}</type>\n    <description>"
                    f"{self._entity_type_descriptions[et]}</description>\n  </entity_type_description>"
                )
            else:
                descs.append(f"  <type>{et}</type>")

        return "\n".join(descs)

    def _process_triplets(self, raw_triplets: list[Any]) -> list[RelationTriplet]:
        """Convert raw triplets from model to RelationTriplet objects.

        :param raw_triplets: Raw triplets from the model.
        :return: Processed RelationTriplet objects.
        """
        processed: list[RelationTriplet] = []
        for raw in raw_triplets:
            head_text = getattr(raw.head, "text", "")
            head_type = getattr(raw.head, "entity_type", "")

            tail_text = getattr(raw.tail, "text", "")
            tail_type = getattr(raw.tail, "entity_type", "")

            processed.append(
                RelationTriplet(
                    head=RelationEntity(text=head_text, entity_type=head_type),
                    relation=getattr(raw, "relation", ""),
                    tail=RelationEntity(text=tail_text, entity_type=tail_type),
                    score=getattr(raw, "score", None),
                )
            )

        return processed

    @override
    def integrate(self, results: Sequence[_BridgeResult | list[Any]], docs: list[Doc]) -> list[Doc]:
        for doc, result in zip(docs, results):
            # Handle both model result objects and raw lists from consolidation.
            raw_triplets = result if isinstance(result, list) else getattr(result, "triplets", [])
            doc.results[self._task_id] = Result(triplets=self._process_triplets(raw_triplets))

        return docs

    @override
    def consolidate(
        self,
        results: Sequence[_BridgeResult],
        docs_offsets: list[tuple[int, int]],
    ) -> Sequence[list[pydantic.BaseModel]]:
        return self._consolidation_strategy.consolidate(results, docs_offsets)
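To show what `_get_relation_descriptions` produces, the sketch below reproduces its formatting logic as a standalone function. `render_relation_descriptions` is a hypothetical name; the f-strings are copied from the method above.

```python
def render_relation_descriptions(relations: list[str], descriptions: dict[str, str]) -> str:
    """Render relations as XML-like snippets, with a description where one exists."""
    lines: list[str] = []
    for rel in relations:
        if rel in descriptions:
            lines.append(
                f"  <relation_description>\n    <relation>{rel}</relation>\n    <description>"
                f"{descriptions[rel]}</description>\n  </relation_description>"
            )
        else:
            # Relations without a description are emitted as a bare tag.
            lines.append(f"  <relation>{rel}</relation>")
    return "\n".join(lines)


rendered = render_relation_descriptions(
    ["founded", "located_in"],
    {"founded": "A person founded a company or organization."},
)
print(rendered)
#   <relation_description>
#     <relation>founded</relation>
#     <description>A person founded a company or organization.</description>
#   </relation_description>
#   <relation>located_in</relation>
```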

`inference_mode` `abstractmethod` `property`

Return inference mode.

Returns:

Type	Description
`ModelWrapperInferenceMode`	Inference mode.

`model_settings` `property`

Return model settings.

Returns:

Type	Description
`ModelSettings`	Model settings.

`model_type` `property`

Return model type.

Returns:

Type	Description
`ModelType`	Model type.

`prompt_template` `property`

Return prompt template.

Chains _prompt_instructions, _prompt_example_xml and _prompt_conclusion.

Note: different models have different expectations of what a prompt should look like. For example, outlines supports the Jinja 2 templating format for inserting values and few-shot examples, whereas DSPy integrates these at a different point in the workflow and hence expects the prompt not to include them. Mind model-specific expectations when creating a prompt template.

Returns:

Type	Description
`str`	Prompt template as string. None if not used by model wrapper.

`__init__(task_id, relations, entity_types, prompt_instructions, model_settings, prompt_signature, model_type, fewshot_examples=())`

Initialize relation extraction bridge.

Parameters:

Name	Type	Description	Default
`task_id`	`str`	Task ID.	required
`relations`	`Sequence[str] | dict[str, str]`	Relations to extract. Can be a list of relation types or a dict mapping types to descriptions.	required
`entity_types`	`Sequence[str] | dict[str, str] | None`	Entity types constraints.	required
`prompt_instructions`	`str | None`	Custom prompt instructions. If None, default instructions are used.	required
`model_settings`	`ModelSettings`	Settings for structured generation.	required
`prompt_signature`	`type[BaseModel]`	Unified Pydantic prompt signature.	required
`model_type`	`ModelType`	Model type.	required
`fewshot_examples`	`Sequence[BaseModel]`	Few-shot examples.	`()`

Source code in sieves/tasks/predictive/relation_extraction/bridges.py

def __init__(
    self,
    task_id: str,
    relations: Sequence[str] | dict[str, str],
    entity_types: Sequence[str] | dict[str, str] | None,
    prompt_instructions: str | None,
    model_settings: ModelSettings,
    prompt_signature: type[pydantic.BaseModel],
    model_type: ModelType,
    fewshot_examples: Sequence[pydantic.BaseModel] = (),
):
    """Initialize relation extraction bridge.

    :param task_id: Task ID.
    :param relations: Relations to extract. Can be a list of relation types or a dict mapping types to descriptions.
    :param entity_types: Entity types constraints.
    :param prompt_instructions: Custom prompt instructions. If None, default instructions are used.
    :param model_settings: Settings for structured generation.
    :param prompt_signature: Unified Pydantic prompt signature.
    :param model_type: Model type.
    :param fewshot_examples: Few-shot examples.
    """
    super().__init__(
        task_id=task_id,
        prompt_instructions=prompt_instructions,
        overwrite=False,
        model_settings=model_settings,
        prompt_signature=prompt_signature,
        model_type=model_type,
        fewshot_examples=fewshot_examples,
    )
    if isinstance(relations, dict):
        self._relations = list(relations.keys())
        self._relation_descriptions = relations
    else:
        self._relations = list(relations)
        self._relation_descriptions = {}

    if isinstance(entity_types, dict):
        self._entity_types = list(entity_types.keys())
        self._entity_type_descriptions = entity_types
    elif entity_types is not None:
        self._entity_types = list(entity_types)
        self._entity_type_descriptions = {}
    else:
        self._entity_types = None
        self._entity_type_descriptions = {}

    self._consolidation_strategy = MultiEntityConsolidation(extractor=self._chunk_extractor)

`extract(docs)`

Extract all values from doc instances that are to be injected into the prompts.

Parameters:

Name	Type	Description	Default
`docs`	`Sequence[Doc]`	Docs to extract values from.	required

Returns:

Type	Description
`Sequence[dict[str, Any]]`	All values from doc instances that are to be injected into the prompts as a sequence.

Source code in sieves/tasks/predictive/bridges.py

def extract(self, docs: Sequence[Doc]) -> Sequence[dict[str, Any]]:
    """Extract all values from doc instances that are to be injected into the prompts.

    :param docs: Docs to extract values from.
    :return: All values from doc instances that are to be injected into the prompts as a sequence.
    """
    return [{"text": doc.text if doc.text else None} for doc in docs]

Schemas for relation extraction task.

`FewshotExample`

Bases: FewshotExample

Few-shot example for relation extraction.

Attributes:

Name	Description
`text`	Input text.
`triplets`	Expected relation triplets.

Source code in sieves/tasks/predictive/schemas/relation_extraction.py

class FewshotExample(BaseFewshotExample):
    """Few-shot example for relation extraction.

    Attributes:
        text: Input text.
        triplets: Expected relation triplets.
    """

    triplets: list[RelationTriplet]

    @property
    def target_fields(self) -> tuple[str, ...]:
        """Return target fields.

        :return: Target fields.
        """
        return ("triplets",)

`input_fields` `property`

Defines which fields are inputs.

Returns:

Type	Description
`Sequence[str]`	Sequence of field names.

`target_fields` `property`

Return target fields.

Returns:

Type	Description
`tuple[str, ...]`	Target fields.

`from_dspy(example)` `classmethod`

Convert from dspy.Example.

Parameters:

Name	Type	Description	Default
`example`	`Example`	Example as `dspy.Example`.	required

Returns:

Type	Description
`Self`	Example as `FewshotExample`.

Source code in sieves/tasks/predictive/schemas/core.py

@classmethod
def from_dspy(cls, example: dspy.Example) -> Self:
    """Convert from `dspy.Example`.

    :param example: Example as `dspy.Example`.
    :returns: Example as `FewshotExample`.
    """
    return cls(**example)

`to_dspy()`

Convert to dspy.Example.

Returns:

Type	Description
`Example`	Example as `dspy.Example`.

Source code in sieves/tasks/predictive/schemas/core.py

def to_dspy(self) -> dspy.Example:
    """Convert to `dspy.Example`.

    :returns: Example as `dspy.Example`.
    """
    return dspy.Example(**ModelWrapper.convert_fewshot_examples([self])[0]).with_inputs(*self.input_fields)

`RelationEntity`

Bases: BaseModel

Entity involved in a relation.

Attributes:

Name	Description
`text`	Surface text of the entity.
`entity_type`	Type of the entity.

Source code in sieves/tasks/predictive/schemas/relation_extraction.py

class RelationEntity(pydantic.BaseModel, frozen=True):
    """Entity involved in a relation.

    Attributes:
        text: Surface text of the entity.
        entity_type: Type of the entity.
    """

    text: str
    entity_type: str
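Declaring the model with `frozen=True` makes instances immutable, and frozen Pydantic models with hashable fields can be used in sets or as dict keys. The sketch below illustrates why that is handy for deduplicating entity mentions. It uses a frozen `dataclass` named `Entity` as a hypothetical, dependency-free stand-in for `RelationEntity`, on the assumption that the two behave alike in this respect.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class Entity:
    """Hypothetical stand-in for RelationEntity (frozen, hence hashable)."""

    text: str
    entity_type: str


mentions = [
    Entity("Red Cross", "ORGANIZATION"),
    Entity("Geneva", "LOCATION"),
    Entity("Red Cross", "ORGANIZATION"),  # Duplicate mention.
]

# dict.fromkeys keeps the first occurrence of each entity and preserves order.
unique_mentions = list(dict.fromkeys(mentions))

# Attempting to mutate a frozen instance raises an error.
try:
    unique_mentions[0].text = "ICRC"
    mutation_failed = False
except dataclasses.FrozenInstanceError:
    mutation_failed = True
```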

`RelationEntityWithContext`

Bases: BaseModel

Entity mention with text span, type, and context for span discovery.

Attributes:

Name	Description
`text`	Surface text of the entity.
`context`	Short context around the entity.
`entity_type`	Type of the entity.

Source code in sieves/tasks/predictive/schemas/relation_extraction.py

class RelationEntityWithContext(pydantic.BaseModel):
    """Entity mention with text span, type, and context for span discovery.

    Attributes:
        text: Surface text of the entity.
        context: Short context around the entity.
        entity_type: Type of the entity.
    """

    text: str
    context: str
    entity_type: str

`RelationTriplet`

Bases: BaseModel

Triplet representing a relation between two entities.

Attributes:

Name	Description
`head`	The subject entity.
`relation`	The type of relation.
`tail`	The object entity.
`score`	Confidence score.

Source code in sieves/tasks/predictive/schemas/relation_extraction.py

class RelationTriplet(pydantic.BaseModel, frozen=True):
    """Triplet representing a relation between two entities.

    Attributes:
        head: The subject entity.
        relation: The type of relation.
        tail: The object entity.
        score: Confidence score.
    """

    head: RelationEntity
    relation: str
    tail: RelationEntity
    score: float | None = None

`RelationTripletWithContext`

Bases: BaseModel

Triplet with context for span discovery.

Attributes:

Name	Description
`head`	The head entity with context.
`relation`	The relation type.
`tail`	The tail entity with context.
`score`	Confidence score.

Source code in sieves/tasks/predictive/schemas/relation_extraction.py

class RelationTripletWithContext(pydantic.BaseModel):
    """Triplet with context for span discovery.

    Attributes:
        head: The head entity with context.
        relation: The relation type.
        tail: The tail entity with context.
        score: Confidence score.
    """

    head: RelationEntityWithContext
    relation: str
    tail: RelationEntityWithContext
    score: float | None = None

`Result`

Bases: BaseModel

Result of a relation extraction task.

Attributes:

Name	Description
`triplets`	List of extracted relation triplets.

Source code in sieves/tasks/predictive/schemas/relation_extraction.py

class Result(pydantic.BaseModel):
    """Result of a relation extraction task.

    Attributes:
        triplets: List of extracted relation triplets.
    """

    triplets: list[RelationTriplet]
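Since `score` is optional on each triplet, code that consumes a `Result` has to decide what to do with unscored triplets. The sketch below shows one possible way to filter by confidence. `Triplet` is a hypothetical, flattened stand-in for `RelationTriplet` (entities are plain strings here), and `filter_by_score` is not part of the library.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    """Hypothetical, flattened stand-in for RelationTriplet."""

    head: str
    relation: str
    tail: str
    score: float | None = None


def filter_by_score(triplets: list[Triplet], threshold: float, keep_unscored: bool = True) -> list[Triplet]:
    """Keep triplets scoring at least `threshold`; unscored ones are kept only if requested."""
    kept: list[Triplet] = []
    for triplet in triplets:
        if triplet.score is None:
            if keep_unscored:
                kept.append(triplet)
        elif triplet.score >= threshold:
            kept.append(triplet)
    return kept


triplets = [
    Triplet("Henri Dunant", "founded", "Red Cross", score=0.9),
    Triplet("Red Cross", "located_in", "Geneva", score=0.4),
    Triplet("Henri Dunant", "works_for", "Red Cross"),  # No score provided.
]

confident = filter_by_score(triplets, threshold=0.5)
strict = filter_by_score(triplets, threshold=0.5, keep_unscored=False)
```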
