Data Sources & Licenses

Every piece of data on SENTINEL is credited. This page lists every upstream source, its license, and how we attribute it.

Our attribution policy

We follow each source's license requirements to the letter. Every page that uses data from a source includes a visible citation, a link to the original, and the license name. Where a source requires share-alike (such as OpenStreetMap or Wikipedia), we either do not derive content from it or release the derivative under the matching license. Where a source is public domain, we still credit it voluntarily, because transparency is the whole point.
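As an illustration only, the policy above could be expressed as a small check plus a citation formatter. This is a minimal sketch, not SENTINEL's actual code: the `Source` class, the `attribution_line` and `may_publish_derivative` functions, and the exact citation wording are all hypothetical, and the share-alike test is simplified to an exact license-name match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """Hypothetical record describing one upstream data source."""
    name: str
    url: str
    license_name: str
    share_alike: bool = False


def attribution_line(source: Source) -> str:
    """Build the visible citation: source name, link to the original, license name."""
    return f"Data from {source.name} ({source.url}), licensed under {source.license_name}."


def may_publish_derivative(source: Source, derivative_license: str) -> bool:
    """Simplified rule: a share-alike source needs a derivative under the matching license."""
    if not source.share_alike:
        return True
    return derivative_license == source.license_name


# Example records (the license flags are assumptions made for this sketch).
geonames = Source("GeoNames", "https://www.geonames.org/", "CC BY 4.0")
osm = Source("OpenStreetMap", "https://www.openstreetmap.org/", "ODbL", share_alike=True)

print(attribution_line(geonames))
print(may_publish_derivative(osm, "CC BY 4.0"))  # share-alike source, non-matching license
```

Real license compatibility is more nuanced than a string comparison, so a check like this would only be a first filter before human review.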

Active sources

GeoNames

Used for: Places module — 854,104 cities, towns, and administrative areas worldwide

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Homepage: geonames.org

Attribution: Every place page on SENTINEL links back to the GeoNames record and credits GeoNames in the page footer.

Project Gutenberg

Used for: Books module — 76,883 classic works in the public domain (US)

License: Public domain in the United States, with the Project Gutenberg trademark policy observed

Homepage: gutenberg.org

Attribution: Every book page links to the original Gutenberg record and includes the Gutenberg ID, author, original publication date, and a "Read on Project Gutenberg" outbound link.
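The fields listed above could be assembled per book roughly as follows. This is a hedged sketch: the `gutenberg_attribution` function and the dictionary keys are hypothetical, and the only external detail assumed is the public `https://www.gutenberg.org/ebooks/<id>` URL pattern for a Gutenberg record.

```python
def gutenberg_attribution(book_id: int, title: str, author: str, published: str) -> dict:
    """Collect the attribution fields a book page would display (hypothetical layout)."""
    if book_id <= 0:
        raise ValueError("Gutenberg IDs are positive integers")
    return {
        "gutenberg_id": book_id,
        "title": title,
        "author": author,
        "original_publication_date": published,
        # Outbound link to the original record on Project Gutenberg.
        "link_url": f"https://www.gutenberg.org/ebooks/{book_id}",
        "link_text": "Read on Project Gutenberg",
    }


# Example: Pride and Prejudice is Project Gutenberg ebook 1342.
record = gutenberg_attribution(1342, "Pride and Prejudice", "Jane Austen", "1813")
print(record["link_url"])
```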

DMOZ / Open Directory Project (historical)

Used for: Seed taxonomy for the directory module

License: Creative Commons Attribution 3.0 Unported (CC BY 3.0); the last snapshot dates from 2017, the year DMOZ closed

Attribution: The directory module honors the spirit of DMOZ as a human-curated web catalog and credits it as the conceptual ancestor.

Planned sources (not yet ingested)

OpenStreetMap POIs

Planned for: Places enrichment — points of interest around cities

License: Open Database License (ODbL) — share-alike applies to the database

Status: Not yet ingested. Will be added with full ODbL-compliant attribution.

USPTO Bulk Patent Data

Planned for: Patents module — 11 million granted US patents

License: Public domain (US Government work)

Status: Planned

OpenAlex

Planned for: Academic papers and authors

License: CC0 1.0 Universal (public domain dedication)

Status: Planned

PubMed

Planned for: Medical research abstracts

License: Bibliographic records are generally free to reuse; abstracts may remain under publisher copyright, and we observe those per-publisher restrictions

Status: Planned

UK Companies House

Planned for: Company profiles (UK)

License: Open Government Licence v3.0

Status: Planned

Sources we will not use

The following types of content are excluded from SENTINEL by policy, regardless of availability:

Want to suggest a source?

If you know of a permissively licensed dataset that belongs on SENTINEL, email [email protected] with the source URL and license. We review suggestions monthly.