Overview

A source is any system, provider, or platform from which Ledge retrieves data. Sources can be public financial institutions, such as banks or Payment Service Providers (PSPs); ERPs; or custom, proprietary sources. Each source can contain multiple datasets, which are distinct collections of data. For example, a source might provide separate datasets for transaction reports, chargeback reports, and account balances.

Basic Components

Datasets: Each source may have multiple datasets. These datasets may follow the same schema or have distinct structures.
Schema: A schema defines the structure and format of the dataset. Multiple datasets within a source may share the same schema (e.g. transactions from different accounts).
Data Types: Datasets can be categorized into one of the following types:
- Transaction: Includes fields such as date, amount, description (memo) and other supporting details.
- Balance: Reflects the balance of an account or similar summary information.
- Other: Used for datasets that do not fit the Transaction or Balance categories, typically for enrichment or for deeper integrations such as ERPs.

For more information on the specific structure of each data type, see the Ingestion section.
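The relationship between sources, datasets, schemas, and data types described above can be sketched as a small data model. This is an illustrative sketch only: the class names, field names, and example values are assumptions made for this example and are not Ledge's actual API.

```python
from dataclasses import dataclass, field
from enum import Enum


class DataType(Enum):
    """The three dataset categories listed above."""
    TRANSACTION = "transaction"
    BALANCE = "balance"
    OTHER = "other"


@dataclass
class Schema:
    name: str
    fields: dict[str, str]  # field name -> type label (hypothetical labels)


@dataclass
class Dataset:
    name: str
    data_type: DataType
    schema: Schema


@dataclass
class Source:
    name: str
    datasets: list[Dataset] = field(default_factory=list)


# Two transaction datasets from different accounts share one schema,
# while the balance dataset has its own structure.
txn_schema = Schema(
    "bank_transactions",
    {"date": "timestamp", "amount": "money", "description": "string"},
)
balance_schema = Schema("balances", {"as_of": "timestamp", "balance": "money"})

bank = Source(
    "example_bank",
    [
        Dataset("checking_transactions", DataType.TRANSACTION, txn_schema),
        Dataset("savings_transactions", DataType.TRANSACTION, txn_schema),
        Dataset("account_balances", DataType.BALANCE, balance_schema),
    ],
)
```

Keeping the schema as a separate object makes the "multiple datasets, one schema" case explicit: both transaction datasets point at the same `Schema` instance.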

Data Ingestion

Sources fetch data periodically to keep information up to date. The exact schedule may vary depending on the type of source and its configuration. Fetched data is ingested and processed to enforce data types, handle empty or null values, and ensure data accuracy and completeness.
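As a rough illustration of what enforcing data types and handling empty or null values can look like, the sketch below cleans one raw row. The set of empty markers, the required field names, and the function names are all assumptions for this example; the actual rules depend on the source's configuration.

```python
from decimal import Decimal, InvalidOperation

# Assumed set of strings treated as "no value"; real sources may differ.
EMPTY_MARKERS = {"", "null", "none", "n/a"}


def clean_value(raw):
    """Return None for empty or null-like input, else the stripped string."""
    if raw is None:
        return None
    text = str(raw).strip()
    return None if text.lower() in EMPTY_MARKERS else text


def enforce_row(row, required=("id", "amount")):
    """Normalize empties, check required fields, and type the amount."""
    cleaned = {key: clean_value(value) for key, value in row.items()}
    missing = [name for name in required if cleaned.get(name) is None]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    try:
        # Decimal avoids binary floating-point rounding on monetary amounts.
        cleaned["amount"] = Decimal(cleaned["amount"])
    except InvalidOperation as exc:
        raise ValueError(f"amount is not numeric: {cleaned['amount']!r}") from exc
    return cleaned
```

For example, `enforce_row({"id": " t1 ", "amount": "10.50", "memo": "N/A"})` yields a row with a trimmed `id`, a `Decimal` amount, and `memo` set to `None`, while a row with an empty `amount` is rejected with a `ValueError`.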

Data is transformed to fit the source’s schema across various data types, including timestamps and time zones (all timestamps are stored in UTC), monetary values (an amount and a currency), and string pattern matching. This may involve restructuring, joining, or pivoting data as needed.
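The two most concrete transformations above, converting timestamps to UTC and representing monetary values as an amount plus a currency, can be sketched as follows. The function names, the `Money` type, the fallback offset for timestamps without a zone, and the comma thousands separator are assumptions for illustration, not Ledge's implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from decimal import Decimal


@dataclass(frozen=True)
class Money:
    amount: Decimal
    currency: str  # e.g. an ISO 4217 code such as "USD"


def to_utc(text, default_offset_hours=0):
    """Parse an ISO 8601 timestamp and convert it to UTC.

    Timestamps without an offset are assumed to be in the zone given by
    default_offset_hours (a hypothetical per-source setting).
    """
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone(timedelta(hours=default_offset_hours)))
    return parsed.astimezone(timezone.utc)


def parse_money(amount_text, currency):
    """Build a Money value, assuming ',' is a thousands separator."""
    return Money(Decimal(amount_text.replace(",", "")), currency.upper())
```

Storing the currency alongside the amount keeps values from different currencies from being compared or summed by accident.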

Transaction records are version-controlled. During ingestion, a duplicate record is either dropped or replaces the previous version, which maintains data quality. What constitutes a duplicate or a new version is configurable.
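One possible reading of the drop-or-replace behavior is sketched below. It assumes that records are identified by a configurable tuple of key fields, that an identical record is dropped, and that a record with the same key but different content becomes a new version when replacement is enabled. The function name, the `version` field, and the `replace` flag are hypothetical.

```python
def ingest(existing, incoming, key_fields=("transaction_id",), replace=True):
    """Merge incoming records into a store keyed by key_fields.

    Returns a new dict; the existing store is not mutated.
    """
    store = dict(existing)
    for record in incoming:
        key = tuple(record[name] for name in key_fields)
        current = store.get(key)
        if current is None:
            # First time this key is seen: store it as version 1.
            store[key] = {**record, "version": 1}
        elif replace and any(current.get(k) != v for k, v in record.items()):
            # Same key, changed content: replace and bump the version.
            store[key] = {**record, "version": current["version"] + 1}
        # Otherwise the record is treated as a duplicate and dropped.
    return store
```

Passing `replace=False` models a source configured to keep the first version it saw, while a different `key_fields` tuple changes what counts as a duplicate.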

In addition to collecting raw data from the source, Ledge makes it possible to enrich source data using additional (potentially third-party) datasets. Enrichment is typically performed using a unique identifier (such as a transaction ID, account ID, or user identifier) to look up and merge external data with the source data.
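The lookup-and-merge step can be sketched as a keyed join. The function name, the default key, and the rule that source fields win on conflict are assumptions for this example; the source does not specify how conflicts are resolved.

```python
def enrich(records, lookup_rows, key="transaction_id"):
    """Merge fields from lookup_rows into records that share the same key."""
    lookup = {row[key]: row for row in lookup_rows}
    enriched = []
    for record in records:
        extra = lookup.get(record[key], {})
        # Assumed policy: source fields win on conflict, so enrichment
        # only adds fields and never overwrites raw source data.
        enriched.append({**extra, **record})
    return enriched
```

Records with no match in the external dataset pass through unchanged, which corresponds to a left join from the source data to the enrichment data.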

