They can also crawl this publicly accessible social media source for their data sets.
Crawling would be silly. They can simply set up a Lemmy node and subscribe to every other server. Reading the ActivityPub feed would be much more efficient than crawling: they wouldn't waste requests re-fetching things that haven't changed, and could just read the ActivityPub updates as they arrive.
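
To make the difference concrete, here's a rough Python sketch of the consuming side. It assumes activities arrive as already-parsed ActivityStreams JSON objects with a `type` and an `object`; the `apply_activity` helper and the example IDs are made up for illustration and are not Lemmy's actual code. A real node would also need to handle delivery and signature verification, which this skips.

```python
from typing import Any


def apply_activity(store: dict[str, dict[str, Any]], activity: dict[str, Any]) -> bool:
    """Apply one pushed activity to a local mirror; return True if the mirror changed."""
    kind = activity.get("type")
    obj = activity.get("object")
    # The object may be embedded (a dict) or referenced by its id (a plain string).
    obj_id = obj if isinstance(obj, str) else (obj or {}).get("id")
    if obj_id is None:
        return False
    if kind in ("Create", "Update") and isinstance(obj, dict):
        store[obj_id] = obj  # only content that actually changed is touched
        return True
    if kind == "Delete":
        return store.pop(obj_id, None) is not None
    return False  # other activity types (Like, Follow, ...) are ignored in this sketch


# Hypothetical inbox contents; the IDs are placeholders, not real posts.
mirror: dict[str, dict[str, Any]] = {}
inbox = [
    {"type": "Create", "object": {"id": "https://example.test/post/1", "name": "Hello"}},
    {"type": "Update", "object": {"id": "https://example.test/post/1", "name": "Hello (edited)"}},
    {"type": "Create", "object": {"id": "https://example.test/post/2", "name": "Second"}},
    {"type": "Delete", "object": "https://example.test/post/2"},
]
changed = sum(apply_activity(mirror, a) for a in inbox)
```

The point is that the work scales with the number of changes pushed to you, not with the number of pages that exist, which is what a crawler has to keep re-checking.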