• TropicalDingdong@lemmy.world
    7 hours ago

    It's phenomenal. I have found a few places where it falls down, and it's usually when the text is incredibly small. You can see it's being downsampled before it gets handed off to the model. One example I found where it falls down is some bank disclosure documentation from Bank of America:

    It just came out as all I’s and o’s.

    For the emails, book text, letters, etc… I genuinely haven’t found a place where it didn’t work correctly while spot-checking the output.
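
    If you want to automate part of the spot check, here is a rough sketch of one way to flag pages that come out as mostly one or two repeated letters (like the all I's and o's case). The function name `looks_degenerate` and the thresholds are made up for illustration; this is not part of the script I put up, and the cutoffs would need tuning on real output.

```python
from collections import Counter


def looks_degenerate(text, top_n=2, threshold=0.8, min_chars=20):
    """Heuristic: True if a couple of letters dominate the OCR output.

    top_n, threshold, and min_chars are illustrative guesses, not tuned values.
    """
    # Only look at alphabetic characters, ignoring case.
    letters = [c.lower() for c in text if c.isalpha()]
    if len(letters) < min_chars:
        # Too little text to judge either way.
        return False
    counts = Counter(letters)
    # Share of all letters taken up by the top_n most common ones.
    top_share = sum(n for _, n in counts.most_common(top_n)) / len(letters)
    return top_share > threshold


if __name__ == "__main__":
    print(looks_degenerate("IoIoIIooIoIoIIIoooIoIoIoIo"))
    print(looks_degenerate("The quick brown fox jumps over the lazy dog near the bank"))
```

    Anything it flags still needs a human look; it only catches the "same couple of letters over and over" failure, not subtler misreads.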

    If you have Colab you can just try the script I put up. All you need to do to have it run is bookmark the House Oversight Committee Google Drive folder to your own Google Drive.