| name | description | source |
| --- | --- | --- |
| OCRbot | A Fediverse bot that, when invoked, replies with the text content of an image using optical character recognition (OCR). | https://github.com/Lynnesbian/OCRbot |

Many social media platforms don't let users caption their images, which means that people with vision impairments, and anyone else who can't view the image(s) in a post, are completely unable to interact with it. The Fediverse is different: major implementations such as Mastodon and Pleroma support captioning images, or at least viewing the captions on existing images. However, this relies on users actually captioning their images, which many don't. This is where OCRbot comes in. You can tag it in your own post or in a reply to someone else's post, and it will use optical character recognition to automatically transcribe the text in the image.
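The two ways of invoking the bot described above can be sketched roughly as follows. This is only an illustration of the idea, not OCRbot's actual code: the `Post` dictionary shape, the `lookup` function, and the `ocr` callable are all hypothetical stand-ins (a real bot would fetch posts from the instance's API and pass the images to an OCR engine).

```python
from typing import Callable, Optional

# Hypothetical post shape, assumed for this sketch only:
# {"id": str, "images": [bytes, ...], "in_reply_to": Optional[str]}
Post = dict


def pick_target(mention: Post, lookup: Callable[[str], Optional[Post]]) -> Optional[Post]:
    """Choose which post's images should be transcribed."""
    # Case 1: the bot was tagged in a post that carries its own images.
    if mention.get("images"):
        return mention
    # Case 2: the bot was tagged in a reply, so check the post being replied to.
    parent_id = mention.get("in_reply_to")
    if parent_id is not None:
        parent = lookup(parent_id)
        if parent is not None and parent.get("images"):
            return parent
    return None


def transcribe(
    mention: Post,
    lookup: Callable[[str], Optional[Post]],
    ocr: Callable[[bytes], str],
) -> str:
    """Build the reply text for a mention, or a short notice if no image is found."""
    target = pick_target(mention, lookup)
    if target is None:
        return "No image found to transcribe."
    texts = [ocr(image).strip() for image in target["images"]]
    if len(texts) == 1:
        return texts[0]
    # Label each transcription when the post has several images.
    return "\n\n".join(f"Image {i}:\n{text}" for i, text in enumerate(texts, start=1))
```

Passing the OCR engine in as a plain callable keeps the sketch independent of any particular OCR library, and makes the tagging logic easy to try out with a fake `ocr` function that returns fixed text.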

It's worth noting that a few months after OCRbot became popular, a Mastodon update added a built-in OCR feature that lets users caption their images from the post creation window. This doesn't make OCRbot entirely obsolete: most instances set the maximum caption length to something very low, whereas the version of OCRbot I host runs on fedi.lynnesbian.space, which has a character limit of 65,535. OCRbot also tends to reply faster than Mastodon's built-in OCR feature takes to run.
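To show why the character limit matters, here is a minimal sketch of how a long transcription could be split to fit whatever limit an instance imposes. It is an assumption for illustration, not a description of what OCRbot does: the function name `split_for_limit` is hypothetical, and the limits used below are just examples (500 is Mastodon's default post length, 65,535 is the limit mentioned above).

```python
from typing import List


def split_for_limit(text: str, limit: int) -> List[str]:
    """Split text into chunks of at most `limit` characters, preferring to break at spaces."""
    if limit < 1:
        raise ValueError("limit must be positive")
    chunks = []
    remaining = text.strip()
    while len(remaining) > limit:
        # Break at the last space that still fits; fall back to a hard cut.
        cut = remaining.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit
        chunks.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks


# A transcription of a few thousand characters needs several posts at a
# 500-character limit, but fits in a single post at a 65,535-character limit.
transcription = "word " * 1000
print(len(split_for_limit(transcription, 500)))
print(len(split_for_limit(transcription, 65535)))
```

With a limit as high as 65,535, almost any transcription fits in a single reply, so no splitting is needed in practice.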