Tesouro Data

benchmarks(include_history=False)

Fetches benchmark data for Brazilian Treasury Bonds from the TN API.

This function retrieves current or historical benchmark data for various Brazilian Treasury bond types (e.g., LTN, LFT, NTN-B). The data is sourced directly from the official Tesouro Nacional API.

Parameters:

  • include_history (bool, default False): If True, includes historical benchmark data. If False, only current benchmarks are returned.
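
According to the source further down, the flag is sent to the API as a query-string value: "S" when True and "N" when False. A minimal sketch of that mapping follows. The base URL and parameter name are hypothetical placeholders, because the module constants `API_BASE_URL` and `API_HISTORY_PARAM` are not shown on this page.

```python
from urllib.parse import urlencode

# Hypothetical placeholders: the real values live in module constants
# (API_BASE_URL, API_HISTORY_PARAM) that are not shown on this page.
EXAMPLE_BASE_URL = "https://example.invalid/benchmarks"
EXAMPLE_HISTORY_PARAM = "history"


def build_endpoint(include_history: bool = False) -> str:
    """Build the request URL, encoding the flag as "S" or "N"."""
    flag = "S" if include_history else "N"
    return f"{EXAMPLE_BASE_URL}?{urlencode({EXAMPLE_HISTORY_PARAM: flag})}"


print(build_endpoint())      # https://example.invalid/benchmarks?history=N
print(build_endpoint(True))  # https://example.invalid/benchmarks?history=S
```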

Returns:

pd.DataFrame: A pandas DataFrame containing the benchmark data. The DataFrame includes the following columns:

  • Benchmark (str): The name or identifier of the benchmark (e.g., 'LFT 3 anos').
  • MaturityDate (datetime64[ns]): The maturity date of the benchmark.
  • BondType (str): The type of the bond (e.g., 'LTN', 'LFT', 'NTN-B').
  • StartDate (datetime64[ns]): The start date for the benchmark's period.
  • EndDate (datetime64[ns]): The end date for the benchmark's period.

Notes:

  • Data is sourced from the official Tesouro Nacional (Brazilian Treasury) API.
  • On an SSL certificate verification error, the request is retried once with certificate verification disabled.
  • The API documentation can be found at: https://portal-conhecimento.tesouro.gov.br/catalogo-componentes/api-leil%C3%B5es
  • Rows with any NaN values are dropped before returning the DataFrame.
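
The row-dropping behaviour can be sketched on a small hand-built frame. The keys mirror raw column names that appear in the source further down, but the records themselves are illustrative, not real API output. The source also returns an empty DataFrame when the request fails or no rows survive, so callers may want to check `df.empty` before using the result.

```python
import pandas as pd

# Illustrative records only: the second one is deliberately incomplete.
registros = [
    {"BENCHMARK": "LFT 6 anos", "TÍTULO": "LFT", "VENCIMENTO": "2020-03-01"},
    {"BENCHMARK": "LTN 6 meses", "TÍTULO": "LTN", "VENCIMENTO": None},
]

# Any row with at least one missing value is removed.
df = pd.DataFrame(registros).dropna()
print(df["BENCHMARK"].tolist())  # ['LFT 6 anos']

# If every row is incomplete, nothing survives and the frame is empty.
all_missing = pd.DataFrame([{"BENCHMARK": None, "VENCIMENTO": None}]).dropna()
print(all_missing.empty)  # True
```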

Examples:

>>> # Get current benchmarks (default behavior)
>>> from pyield import tn
>>> df_current = tn.benchmarks()
>>> # Get historical benchmarks
>>> df_history = tn.benchmarks(include_history=True)
>>> df_history.head()
   StartDate    EndDate BondType MaturityDate     Benchmark
0 2014-01-01 2014-06-30      LFT   2020-03-01    LFT 6 anos
1 2014-01-01 2014-06-01      LTN   2014-10-01   LTN 6 meses
2 2014-01-01 2014-06-30      LTN   2015-04-01  LTN 12 meses
3 2014-01-01 2014-06-30      LTN   2016-04-01  LTN 24 meses
4 2014-01-01 2014-06-30      LTN   2018-01-01  LTN 48 meses
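
Once the historical frame is loaded, a common follow-up is to select the benchmarks that were in force on a given date. The sketch below does this on three rows copied from the example output above, so it runs without network access. The helper name `active_on` is hypothetical and not part of pyield.

```python
import pandas as pd

# Three rows copied from the example output, standing in for
# tn.benchmarks(include_history=True).
df_history = pd.DataFrame(
    {
        "StartDate": pd.to_datetime(["2014-01-01", "2014-01-01", "2014-01-01"]),
        "EndDate": pd.to_datetime(["2014-06-30", "2014-06-01", "2014-06-30"]),
        "BondType": ["LFT", "LTN", "LTN"],
        "MaturityDate": pd.to_datetime(["2020-03-01", "2014-10-01", "2015-04-01"]),
        "Benchmark": ["LFT 6 anos", "LTN 6 meses", "LTN 12 meses"],
    }
)


def active_on(df: pd.DataFrame, date: str) -> pd.DataFrame:
    """Return the rows whose [StartDate, EndDate] period contains `date`."""
    ts = pd.Timestamp(date)
    mask = (df["StartDate"] <= ts) & (ts <= df["EndDate"])
    return df.loc[mask].reset_index(drop=True)


ltn_mid_june = active_on(df_history, "2014-06-15").query("BondType == 'LTN'")
print(ltn_mid_june["Benchmark"].tolist())  # ['LTN 12 meses']
```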
Source code in pyield/tn/benchmark.py
def benchmarks(include_history: bool = False) -> pd.DataFrame:
    """Fetches benchmark data for Brazilian Treasury Bonds from the TN API.

    This function retrieves current or historical benchmark data for various Brazilian
    Treasury bond types (e.g., LTN, LFT, NTN-B). The data is sourced directly from the
    official Tesouro Nacional API.

    Args:
        include_history (bool, optional): If `True`, includes historical benchmark data.
            If `False` (default), only current benchmarks are returned.

    Returns:
        pd.DataFrame: A pandas DataFrame containing the benchmark data.
            The DataFrame includes the following columns:

            *   `Benchmark` (str): The name or identifier of the benchmark
                (e.g., 'LFT 3 anos').
            *   `MaturityDate` (datetime64[ns]): The maturity date of the benchmark.
            *   `BondType` (str): The type of the bond (e.g., 'LTN', 'LFT', 'NTN-B').
            *   `StartDate` (datetime64[ns]): The start date for the benchmark's period.
            *   `EndDate` (datetime64[ns]): The end date for the benchmark's period.

    Notes:
        *   Data is sourced from the official Tesouro Nacional (Brazilian Treasury) API.
        *   On an SSL certificate verification error, the request is retried once
            with certificate verification disabled.
        *   The API documentation can be found at:
            https://portal-conhecimento.tesouro.gov.br/catalogo-componentes/api-leil%C3%B5es
        *   Rows with any `NaN` values are dropped before returning the DataFrame.

    Examples:
        >>> # Get current benchmarks (default behavior)
        >>> from pyield import tn
        >>> df_current = tn.benchmarks()

        >>> # Get historical benchmarks
        >>> df_history = tn.benchmarks(include_history=True)
        >>> df_history.head()
           StartDate    EndDate BondType MaturityDate     Benchmark
        0 2014-01-01 2014-06-30      LFT   2020-03-01    LFT 6 anos
        1 2014-01-01 2014-06-01      LTN   2014-10-01   LTN 6 meses
        2 2014-01-01 2014-06-30      LTN   2015-04-01  LTN 12 meses
        3 2014-01-01 2014-06-30      LTN   2016-04-01  LTN 24 meses
        4 2014-01-01 2014-06-30      LTN   2018-01-01  LTN 48 meses
    """
    session = requests.Session()
    include_history_param_value = "S" if include_history else "N"
    api_endpoint = f"{API_BASE_URL}?{API_HISTORY_PARAM}={include_history_param_value}"

    try:
        stn_benchmarks = session.get(api_endpoint)
        stn_benchmarks.raise_for_status()
    except requests.exceptions.SSLError as e:
        logger.error(
            f"SSL error encountered: {e}. Retrying without certificate verification."
        )
        stn_benchmarks = session.get(api_endpoint, verify=False)
        stn_benchmarks.raise_for_status()
    except requests.exceptions.RequestException as e:
        logger.error(f"Error fetching benchmarks from API: {e}")
        return pd.DataFrame()

    response_dict = stn_benchmarks.json()

    # First check: make sure the API response contains valid records
    if not response_dict or not response_dict.get("registros"):
        logger.warning("API response did not contain 'registros' key or it was empty.")
        return pd.DataFrame()

    # Try to build the DataFrame. The .dropna() call may leave it empty.
    df = pd.DataFrame(response_dict["registros"]).dropna()
    df = df.convert_dtypes()

    # Second check: make sure the resulting DataFrame (after dropna) is not empty.
    # This matters because .dropna() may remove every row.
    if df.empty:
        logger.warning(
            "No valid benchmark data found after initial processing "
            "(e.g., all rows had NaNs and were dropped)."
        )
        return pd.DataFrame()

    # If we got this far, the DataFrame has data and can be processed.
    df["VENCIMENTO"] = pd.to_datetime(df["VENCIMENTO"])
    df["TERMINO"] = pd.to_datetime(df["TERMINO"])
    df["INÍCIO"] = pd.to_datetime(df["INÍCIO"])
    df["BENCHMARK"] = df["BENCHMARK"].str.strip()
    df["TÍTULO"] = df["TÍTULO"].str.strip()

    if not include_history:
        # In theory the API already returns only active benchmarks,
        # but make sure the DataFrame only contains benchmarks that are
        # active for the current period.
        today = dt.now(TIMEZONE_BZ).date()
        today = pd.Timestamp(today)
        df = df.query("INÍCIO <= @today <= TERMINO").reset_index(drop=True)

    # Check again whether the DataFrame became empty *after the conditional filter*
    # (only when `include_history` is False)
    if df.empty and not include_history:
        logger.warning(
            "No current benchmark data found after filtering by active period."
        )

    column_order = [c for c in COLUMN_MAPPING if c in df.columns]
    return (
        df[column_order]
        .rename(columns=COLUMN_MAPPING)
        .sort_values(["StartDate", "BondType", "MaturityDate"])
        .reset_index(drop=True)
    )
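
In the source above, the retry runs inside the `except SSLError` handler, so an exception raised by the retry itself is not handled by the sibling `except RequestException` clause and would propagate to the caller. The sketch below shows one way to cover both attempts. It is not pyield code: the HTTP call is injected as a hypothetical `get` callable so the example runs offline, `ssl.SSLError` stands in for `requests.exceptions.SSLError`, and `None` stands in for the empty DataFrame.

```python
import ssl
from typing import Callable, Optional


def fetch_with_ssl_fallback(
    get: Callable[[str, bool], dict], url: str
) -> Optional[dict]:
    """Call get(url, verify); on an SSL error retry once with verify=False."""
    try:
        return get(url, True)
    except ssl.SSLError:
        # First attempt failed certificate verification: retry once without it,
        # and guard the retry so its failures are handled as well.
        try:
            return get(url, False)
        except Exception:
            return None
    except Exception:
        return None


calls = []


def fake_get(url: str, verify: bool) -> dict:
    """Fake transport: fails verification, succeeds when verify is False."""
    calls.append(verify)
    if verify:
        raise ssl.SSLError("certificate verify failed")
    return {"registros": []}


result = fetch_with_ssl_fallback(fake_get, "https://example.invalid")
print(result, calls)  # {'registros': []} [True, False]
```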