• Destragras
    12
    1 year ago

    I tried finding information on what indexer they are using. Are they using their own?

    Edit: it says this in the readme:

    The commoncrawl organization for crawling the web and making the dataset readily available. Even though we have our own crawler now, commoncrawl has been a huge help in the early stages of development.