AirbyteLoader
Airbyte is a data integration platform for ELT pipelines from APIs, databases & files to warehouses & lakes. It has the largest catalog of ELT connectors to data warehouses and databases.
This page covers how to load any Airbyte source into LangChain documents.
Installation
To use AirbyteLoader, install the langchain-airbyte integration package:
%pip install -qU langchain-airbyte
Note: Currently, the airbyte library does not support Pydantic v2. Please downgrade to Pydantic v1 to use this package.
Note: This package also currently requires Python 3.10+.
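Both constraints can be checked before installing. The following is a minimal sketch using only the standard library; the helper names find_problems and installed_pydantic_version are hypothetical and are not part of langchain-airbyte.

```python
import sys
from importlib import metadata


def find_problems(python_version, pydantic_version, min_python=(3, 10)):
    """Return a list of problems for the given versions (empty if none)."""
    problems = []
    if tuple(python_version[:2]) < tuple(min_python):
        problems.append("Python %d.%d+ is required" % tuple(min_python))
    # pydantic_version is None when Pydantic is not installed at all.
    if pydantic_version is not None:
        major = int(pydantic_version.split(".")[0])
        if major >= 2:
            problems.append("Pydantic v1 is required, found " + pydantic_version)
    return problems


def installed_pydantic_version():
    """Return the installed Pydantic version string, or None if absent."""
    try:
        return metadata.version("pydantic")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for problem in find_problems(sys.version_info, installed_pydantic_version()):
        print("warning:", problem)
```

An empty result only means these two checks passed; it does not guarantee that the package installs cleanly.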
Loading Documents
By default, AirbyteLoader loads structured data from a stream and outputs YAML-formatted documents.
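from langchain_airbyte import AirbyteLoader

loader = AirbyteLoader(
    source="source-faker",  # Airbyte connector that generates fake records
    stream="users",  # stream to read from that source
    config={"count": 10},  # source-specific configuration
)
docs = loader.load()
# Show the first 500 characters of the first document
print(docs[0].page_content[:500])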
```yaml
academic_degree: PhD
address:
  city: Lauderdale Lakes
  country_code: FI
  postal_code: '75466'
  province: New Jersey
  state: Hawaii
  street_name: Stoneyford
  street_number: '1112'
age: 44
blood_type: "O\u2212"
created_at: '2004-04-02T13:05:27+00:00'
email: bread2099+1@outlook.com
gender: Fluid
height: '1.62'
id: 1
language: Belarusian
name: Moses
nationality: Dutch
occupation: Track Worker
telephone: 1-467-194-2318
title: M.Sc.Tech.
updated_at: '2024-02-27T16:41:01+00:00'
weight: 6
```
You can also specify a custom prompt template for formatting documents:
from langchain_core.prompts import PromptTemplate

loader_templated = AirbyteLoader(
    source="source-faker",
    stream="users",
    config={"count": 10},
    template=PromptTemplate.from_template(
        "My name is {name} and I am {height} meters tall."
    ),
)
docs_templated = loader_templated.load()
print(docs_templated[0].page_content)
My name is Verdie and I am 1.73 meters tall.
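The substitution step can be illustrated without Airbyte. The following is a minimal sketch, assuming each record's fields are matched by name to the placeholders in the template. It uses Python's built-in str.format as a stand-in for PromptTemplate, and both the render helper and the record are hypothetical, with values borrowed from the sample YAML output shown earlier.

```python
# Hand-written stand-in for one record from the "users" stream.
record = {"name": "Moses", "height": "1.62", "age": 44}

template = "My name is {name} and I am {height} meters tall."


def render(template: str, record: dict) -> str:
    """Fill the named placeholders in the template from the record's fields."""
    # Fields with no matching placeholder (here, "age") are simply ignored.
    return template.format(**record)


print(render(template, record))
# prints: My name is Moses and I am 1.62 meters tall.
```

With this approach, a placeholder that names a field missing from the record raises a KeyError, so the template should only reference fields that the stream actually provides.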