
When I rebuild the external mapping to Goodreads, I see a JSON response
that includes only the most basic information, such as ISBN numbers:
{
  "identifiers": {
    "goodreads": [
      "211052256"
    ],
    "isbn_10": [
      "9400411723"
    ],
    "isbn_13": [
      "9789400411722"
    ],
    "kindle_asin": [
      "B0CZP9H9GJ"
    ]
  }
}
However, the HTML page for the book includes a lot more information, such as
the title, author, summary, and cover image. This information is embedded in the <script id="__NEXT_DATA__" type="application/json">
tag in the page's HTML.
I don't know how stable the shape of this JSON is. I also realize you can't make cross-origin requests from the browser, and that routing the requests through your servers would likely get you severely rate limited or IP blocked by Goodreads.
But if librarians could paste the HTML of the page into a form and have the information extracted from it, that would significantly reduce the workload.
An alternative would be a browser extension or Tampermonkey script that does this work client-side.
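To make the paste-the-HTML idea concrete, here is a minimal sketch in Python. It only assumes the page embeds its data in a <script id="__NEXT_DATA__" type="application/json"> tag; the key paths inside the payload (props.pageProps etc.) are illustrative, since the real structure is deeply nested and may change whenever Goodreads redeploys.

```python
# Hypothetical sketch: pull the JSON payload out of pasted Goodreads HTML.
# The script-tag id comes from the page source; everything below the JSON
# root is an assumption and would need to be mapped against a real page.
import json
import re

def extract_next_data(html: str) -> dict:
    """Return the parsed JSON embedded in the __NEXT_DATA__ script tag."""
    match = re.search(
        r'<script id="__NEXT_DATA__" type="application/json">(.*?)</script>',
        html,
        re.DOTALL,
    )
    if not match:
        raise ValueError("No __NEXT_DATA__ script tag found in the pasted HTML")
    return json.loads(match.group(1))

# Stand-in page for demonstration; a real Goodreads page nests the book
# record much more deeply (e.g. under props.pageProps.apolloState).
sample_html = (
    '<html><body>'
    '<script id="__NEXT_DATA__" type="application/json">'
    '{"props": {"pageProps": {"title": "Example Book", "author": "Jane Doe"}}}'
    '</script></body></html>'
)
data = extract_next_data(sample_html)
print(data["props"]["pageProps"]["title"])  # Example Book
```

A server-side form handler (or a client-side extension) could run exactly this extraction on whatever HTML the librarian pastes, then map the fields it finds onto the book record.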
Happy to help!
For another book imported from Goodreads, a lot more info could be found on Google Books via the ISBN: https://www.googleapis.com/books/v1/volumes?q=isbn:9781800183117
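For the Google Books route, a lookup could be as simple as the sketch below. The endpoint matches the URL above (Books API v1); the field names in the parsing step (volumeInfo, imageLinks, and so on) follow that API's documented response shape, but the parsing is kept defensive because not every volume has every field. A canned response is used so the example runs offline.

```python
# Hypothetical sketch: enrich a record via the Google Books API by ISBN.
import urllib.parse

def google_books_url(isbn: str) -> str:
    """Build the Books API v1 volumes query for a given ISBN."""
    query = urllib.parse.urlencode({"q": f"isbn:{isbn}"})
    return f"https://www.googleapis.com/books/v1/volumes?{query}"

def pick_metadata(response: dict) -> dict:
    """Reduce a Books API response to the fields a librarian would add by hand."""
    if not response.get("items"):
        return {}
    info = response["items"][0].get("volumeInfo", {})
    return {
        "title": info.get("title"),
        "authors": info.get("authors", []),
        "description": info.get("description"),
        "cover": info.get("imageLinks", {}).get("thumbnail"),
    }

# Canned response standing in for a live request to keep this runnable offline.
canned = {"items": [{"volumeInfo": {"title": "Example", "authors": ["Jane Doe"]}}]}
print(google_books_url("9781800183117"))
print(pick_metadata(canned)["title"])  # Example
```

A background job could run this per imported ISBN, which is what prompted my question about the syncing strategy below.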
Anyway, I'm sure you're well aware of all this; it just made me curious what the syncing strategy is.
Is there any point in librarians amending this information and manually adding links to other book databases, or will those links eventually be created automatically by some background process?