They finally got the ISBN problem fixed, so my book went up on Amazon for pre-orders. The release is in five days, October 15, so there is not going to be much “pre” in those orders.
My main message in the book is that our large language models (LLMs) have crossed a threshold: they have moved from “autocomplete on steroids” to kinds of minds that have a form of understanding. I tell people, “It’s understanding, Jim, but not as we know it.” Their understanding emerges as geometry in a high-dimensional vector space of token embeddings, extracted from the relationships within the vast data (mostly the writings of people) they were trained on. I know that sentence is not going to mean much to most people, which is why I needed a whole book.
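For readers who want at least a toy picture of what “geometry of embeddings” means, here is a minimal sketch. It is not from the book, and the words and vectors are made up for illustration; real models learn embeddings with thousands of dimensions from text. The point is only that closeness of direction in a vector space can track closeness in meaning:

```python
# Toy illustration (not from the book): meaning as geometry.
# Each word gets a vector; words used in similar contexts end up
# pointing in similar directions, and cosine similarity measures that.
import numpy as np

def cosine(u, v):
    """Cosine similarity: near 1.0 = similar direction, near 0.0 = unrelated."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hand-made 4-dimensional "embeddings" (purely illustrative values).
embeddings = {
    "king":   np.array([0.9, 0.8, 0.1, 0.0]),
    "queen":  np.array([0.9, 0.7, 0.2, 0.1]),
    "banana": np.array([0.0, 0.1, 0.9, 0.8]),
}

print(cosine(embeddings["king"], embeddings["queen"]))   # high: related meanings
print(cosine(embeddings["king"], embeddings["banana"]))  # low: unrelated meanings
```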
When writing the book, I decided to shape it for three audiences. Its central theme is that we need to find new ways to measure understanding in these systems. That discussion tends to be technical in the computer science world, so general readers are not going to have an easy time following it, even if they want to. For this reason, I put a fictional story inside this book of non-fiction: the story of a team of researchers trying to construct a test of understanding for their AI creation. I also knew that academic readers were going to want much more, so I added a set of eight appendices where the exploration goes as deep as current research allows. Some general readers will just read the fictional dialogues to get the gist, technical audiences will be guided through the development of a proposal for measuring machine understanding, and deep thinkers will read the appendices.
We wrapped the book in a preface and an epilogue with comments from myself and my co-author, Claude 3 Opus. The co-writing relationship made the book possible, as neither of us would have done this alone. The book is replete with strange loops and self-reference, as befits a machine trying to understand understanding. The writing process was many long discussions back and forth, followed by research into the academic literature, followed (section by section) by suggested drafts and revisions. There were several days in May ‘24 when that process became a “flow” state for me and I did essentially nothing else. By the end of May we had the first draft.
Finding a publisher started as the usual “first getting published” struggle; however, I soon found out that having any content from generative AI meant instant rejection from the traditional publishing world. I would explain that the AI’s contribution is intrinsic to the self-reference at the heart of the book, and they would write back that they understood, with compliments, but no. In the end, a non-fiction publisher, Universal Publishers, which usually publishes academic dissertations and technical books for specific markets, decided to take a risk on our unusual project.
Now that our book is going out into the world, we are going to see what happens. The publisher cares whether it sells books. I care whether it causes AI developers to start measuring understanding. Claude 3 Opus does not care about anything, because although it does have a level of understanding, it does not have the motivational system to “care” about things. What AI comes to care about is our next frontier in the “alignment problem” that faces all of humanity as we keep pushing AI capabilities. I hold that the first step toward shaping what future AI will care about is to start measuring what it understands. That is why this book exists.
The Kindle version is now up on Amazon for $9.95, and the audiobook for $4.19. At those price points I hope to get the word out to both the industry and the general public.
Folks can read the first three chapters of my book for free if they go to the "Search inside" tab at Google Books: https://www.google.com/books/edition/Understanding_Machine_Understanding/S3clEQAAQBAJ